ODC and Dask (LocalCluster)

This notebook explores the use of ODC with a Dask LocalCluster. The goal is to introduce fundamental concepts and the role Dask can play when loading data with datacube and in subsequent computation using xarray.

The example computation is fairly typical of an EO data processing pipeline. We'll start with a small area and time period and progressively scale the example up. EO scientists may find some aspects of these examples unrealistic, but this isn't an EO science course :-)

For the base example we'll be using the Australian island state of Tasmania as our Region of Interest (ROI), initially at paddock size and progressively increasing to the entire island. The basic algorithm is:

  1. Specify the Region of Interest, satellite product, EO satellite bands, time range and desired CRS for the datacube query
  2. Load data using datacube.load()
  3. Mask valid data
  4. Visualise the ROI
  5. Compute NDVI
  6. Visualise NDVI
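
Steps 3 and 5 can be sketched with plain NumPy on made-up values. This is a minimal illustration, not the notebook's own code: the sample arrays are synthetic, the nodata value of 0 is an assumption, and `ndvi_from_bands` is a hypothetical helper name. NDVI is (NIR - red) / (NIR + red), computed only where both bands hold valid data.

```python
import numpy as np

def ndvi_from_bands(red, nir, nodata=0):
    """Mask nodata pixels, then compute NDVI = (nir - red) / (nir + red)."""
    red = np.asarray(red, dtype="float64")
    nir = np.asarray(nir, dtype="float64")
    # A pixel is valid only if neither band holds the nodata value (assumed 0)
    valid = (red != nodata) & (nir != nodata)
    red = np.where(valid, red, np.nan)
    nir = np.where(valid, nir, np.nan)
    return (nir - red) / (nir + red)

# Synthetic reflectance values; the pixel at [1, 0] has nodata in the red band
red = np.array([[0.10, 0.20], [0.0, 0.05]])
nir = np.array([[0.30, 0.20], [0.4, 0.45]])
ndvi = ndvi_from_bands(red, nir)
print(ndvi)
```

Masked pixels come out as NaN, so they drop out of later statistics and plots. Real surface reflectance products usually also need their scale and offset applied before the index is computed.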

Some cells in this notebook will take minutes to run, so be patient.

In [1]:
# EASI tools
import git
import sys, os
os.environ['USE_PYGEOS'] = '0'
repo = git.Repo('.', search_parent_directories=True).working_tree_dir
if repo not in sys.path: sys.path.append(repo)
from easi_tools import EasiDefaults, notebook_utils
easi = EasiDefaults()
Successfully found configuration for deployment "chile"
In [2]:
import datacube
from datacube.utils import masking

The next cell sets out all the query parameters used in our datacube.load(). For this run we keep the ROI quite small.

In [3]:
# Get the default latitude & longitude extents
study_area_lat = easi.latitude
study_area_lon = easi.longitude

# Or choose your own by uncommenting and modifying this section
###############################################################
# # Central Tasmania (near Little Pine Lagoon)
# central_lat = -42.019
# central_lon = 146.615

# # Set the buffer to load around the central coordinates
# # The buffer is the distance (in degrees) from the centre to each edge, so the bbox spans 2 x buffer in both dimensions
# buffer = 0.05

# # Compute the bounding box for the study area
# study_area_lat = (central_lat - buffer, central_lat + buffer)
# study_area_lon = (central_lon - buffer, central_lon + buffer)
###############################################################

# Data product
product = easi.product('landsat')
# product = 'landsat8_c2l2_sr'

# Set the date range to load data over
set_time = easi.time
# set_time = ("2021-01-01", "2021-01-31")

# Set the measurements/bands to load. None will load all of them
measurements = None

# Set the coordinate reference system and output resolution
set_crs = easi.crs('landsat')  # If defined, else None
set_resolution = easi.resolution('landsat')  # If defined, else None
# set_crs = "epsg:3577"
# set_resolution = (-30, 30)

group_by = "solar_day"
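
To get a feel for how the buffer translates into data volume, the commented-out bounding box calculation can be paired with a rough pixel estimate. This is a sketch under stated assumptions: `bounding_box` and `approx_pixels` are hypothetical helpers, the 30 m pixel size matches the commented-out resolution, and the figure of about 111.32 km per degree of latitude is an approximation.

```python
import math

def bounding_box(central_lat, central_lon, buffer_deg):
    """Return ((lat_min, lat_max), (lon_min, lon_max)) around a centre point."""
    lat = (central_lat - buffer_deg, central_lat + buffer_deg)
    lon = (central_lon - buffer_deg, central_lon + buffer_deg)
    return lat, lon

def approx_pixels(lat_range, lon_range, pixel_size_m=30.0):
    """Rough (rows, cols) estimate, assuming ~111.32 km per degree of latitude."""
    metres_per_degree = 111_320.0
    mid_lat = math.radians(sum(lat_range) / 2)
    height_m = (lat_range[1] - lat_range[0]) * metres_per_degree
    # Degrees of longitude shrink with the cosine of the latitude
    width_m = (lon_range[1] - lon_range[0]) * metres_per_degree * math.cos(mid_lat)
    return round(height_m / pixel_size_m), round(width_m / pixel_size_m)

# Central Tasmania example values from the commented-out section
study_area_lat, study_area_lon = bounding_box(-42.019, 146.615, 0.05)
rows, cols = approx_pixels(study_area_lat, study_area_lon)
print(study_area_lat, study_area_lon, rows, cols)
```

A buffer of 0.05 degrees gives a box a few hundred pixels on each side at 30 m resolution. Doubling the buffer roughly quadruples the number of pixels to load.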

Now initialise the datacube.

In [4]:
dc = datacube.Datacube()

# Access AWS "requester-pays" buckets
# This is necessary for reading data from most third-party AWS S3 buckets such as for Landsat and Sentinel-2
from datacube.utils.aws import configure_s3_access
configure_s3_access(aws_unsigned=False, requester_pays=True);

Now load the data. This first dc.load() does not use Dask, so it will take a little bit of time.

We use %%time to keep track of how long things take to complete.

In [5]:
%%time
dataset = None # clear results from any previous runs
dataset = dc.load(
            product=product,
            x=study_area_lon,
            y=study_area_lat,
            time=set_time,
            measurements=measurements,
            resampling={"qa_pixel": "nearest", "*": "average"},
            output_crs=set_crs,
            resolution=set_resolution,
            group_by=group_by,
        )
CPU times: user 2.2 s, sys: 4.78 s, total: 6.98 s
Wall time: 20.1 s
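
The resampling argument in the load uses "nearest" for qa_pixel and "average" for every other band. A likely reason is that qa_pixel packs bit flags into each integer, so averaging neighbouring values produces a number whose bits describe neither input. The sketch below shows this with two sample values; `set_bits` is a hypothetical helper, and integer averaging stands in for what an averaging resampler would do.

```python
def set_bits(value):
    """Return the positions of the bits that are set in an integer."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]

a, b = 21824, 22280        # two sample qa_pixel values
mean = (a + b) // 2        # stand-in for an averaging resampler

print(a, set_bits(a))
print(b, set_bits(b))
print(mean, set_bits(mean))
```

The averaged value has bits 2 and 5 set, which appear in neither input, so the flags it reports are invented. Nearest-neighbour resampling always returns one of the original values and keeps the flags meaningful.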

The result of the datacube.load() function is an xarray.Dataset. The notebook renders a description of the dataset variable as an HTML block with a lot of useful information about the structure of the data. If you open up the Data variables (click the > Data variables) and click on the stacked cylinders for one of them, you will see that the actual data array is available and shown in summary form.

NOTE that you can see real numbers in the array when you do this. This will change when we start using Dask.

This visualisation will become increasingly important when Dask is enabled and as scale out occurs, so take a moment now to just poke around the interface. Depending on the area of interest set above, you should have a relatively small area (perhaps around 300 to 400 pixels in each of the x and y dimensions) and perhaps up to 10 time slices. This is a relatively small scale and fine to do without using Dask.
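
A quick back-of-envelope calculation shows why this scale is comfortable without Dask. The sketch below uses the dimensions and dtypes reported for this run's dataset (6 time slices of 381 x 335 pixels, ten variables); your own numbers will differ with a different area or time range.

```python
# Dimensions and dtypes as reported in the dataset summary for this run
shape = {"time": 6, "y": 381, "x": 335}
bytes_per_value = {"uint16": 2, "uint8": 1}
variables = {
    "coastal": "uint16", "blue": "uint16", "green": "uint16", "red": "uint16",
    "nir08": "uint16", "swir16": "uint16", "swir22": "uint16",
    "qa_pixel": "uint16", "qa_aerosol": "uint8", "qa_radsat": "uint16",
}

values_per_variable = shape["time"] * shape["y"] * shape["x"]
total_bytes = sum(
    values_per_variable * bytes_per_value[dtype] for dtype in variables.values()
)
print(f"{values_per_variable:,} values per variable, {total_bytes / 2**20:.1f} MiB in total")
```

At roughly 14 MiB the whole dataset fits easily in memory. On a loaded dataset, xarray's `nbytes` property reports a similar figure, slightly larger because it also counts the coordinates.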

In [6]:
dataset
Out[6]:
<xarray.Dataset>
Dimensions:      (time: 6, y: 381, x: 335)
Coordinates:
  * time         (time) datetime64[ns] 2022-02-07T14:39:10.740819 ... 2022-04...
  * y            (y) float64 6.692e+06 6.692e+06 ... 6.681e+06 6.681e+06
  * x            (x) float64 8.572e+05 8.572e+05 ... 8.672e+05 8.672e+05
    spatial_ref  int32 32718
Data variables:
    coastal      (time, y, x) uint16 41267 40757 40797 41270 ... 9488 9355 9121
    blue         (time, y, x) uint16 41290 40840 40834 41300 ... 10065 9930 9576
    green        (time, y, x) uint16 39954 39505 39326 ... 11160 11214 10860
    red          (time, y, x) uint16 39881 39380 39245 ... 12368 12408 11827
    nir08        (time, y, x) uint16 38927 38419 38287 ... 13923 14054 13285
    swir16       (time, y, x) uint16 28605 28092 27987 ... 16652 16490 15362
    swir22       (time, y, x) uint16 21633 21250 21185 ... 15800 15671 14830
    qa_pixel     (time, y, x) uint16 22280 22280 22280 ... 21824 21824 21824
    qa_aerosol   (time, y, x) uint8 204 220 224 208 221 224 ... 96 96 96 96 96
    qa_radsat    (time, y, x) uint16 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
Attributes:
    crs:           epsg:32718
    grid_mapping:  spatial_ref
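
The qa_pixel values in this output are bit fields rather than measurements. The sketch below decodes a few of them by hand. The bit positions follow the Landsat Collection 2 QA_PIXEL layout and should be treated as an assumption to check against the product's `flags_definition` attribute; `decode_qa_pixel` and `is_usable` are hypothetical helpers, and the validity rule is only one possible choice.

```python
# Single-bit flags in the Landsat Collection 2 QA_PIXEL band (assumed layout)
QA_FLAGS = {
    0: "fill", 1: "dilated_cloud", 2: "cirrus", 3: "cloud",
    4: "cloud_shadow", 5: "snow", 6: "clear", 7: "water",
}

def decode_qa_pixel(value):
    """Return the names of the single-bit flags set in a qa_pixel value."""
    return [name for bit, name in QA_FLAGS.items() if (value >> bit) & 1]

def is_usable(value):
    """Hypothetical rule: clear bit set, and neither fill nor cloud shadow."""
    flags = decode_qa_pixel(value)
    return "clear" in flags and "fill" not in flags and "cloud_shadow" not in flags

# Values that appear in the qa_pixel band of this dataset
for value in (21824, 21952, 22280, 23888):
    print(value, decode_qa_pixel(value), is_usable(value))
```

In practice the `masking` module imported earlier provides helpers that work from the `flags_definition` attribute, which is the more usual route in ODC. The manual decoding here only shows what those helpers are reading.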
                    867105.0, 867135.0, 867165.0, 867195.0],
                   dtype='float64', name='x', length=335))
  • crs :
    epsg:32718
    grid_mapping :
    spatial_ref

Next, filter out pixels that are affected by clouds and other issues, then compute the NDVI. Since we aren't restricting the time range here, this is performed for all images in the dataset.

In [7]:
%%time
# Identify pixels that don't have cloud, cloud shadow or water
from datacube.utils import masking

good_pixel_flags = {
    'nodata': False,
    'cloud': 'not_high_confidence',
    'cloud_shadow': 'not_high_confidence',
    'water': 'land_or_cloud'
}

cloud_free_mask = masking.make_mask(dataset['qa_pixel'], **good_pixel_flags)

# Apply the mask
cloud_free = dataset.where(cloud_free_mask)

# Calculate the components that make up the NDVI calculation
band_diff = cloud_free.nir08 - cloud_free.red
band_sum = cloud_free.nir08 + cloud_free.red
# Calculate NDVI (masked pixels are NaN and stay NaN in the result)
ndvi = None  # clear results from any previous runs
ndvi = band_diff / band_sum
CPU times: user 24.2 ms, sys: 12.1 ms, total: 36.3 ms
Wall time: 35.9 ms
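
To make the masking and NDVI steps concrete, here is a minimal numpy-only sketch of what the cell above does conceptually. It is not datacube's implementation of make_mask. The bit positions are taken from the product's qa_pixel flags_definition (nodata bit 0, cloud bit 3, cloud_shadow bit 4, water bit 7); the four sample pixels, their reflectance values and the helper name good_pixel_mask are made up for illustration.

```python
import numpy as np

# Bit positions from the qa_pixel flags_definition (nodata, cloud, cloud_shadow, water).
REJECT_BITS = (0, 3, 4, 7)


def good_pixel_mask(qa):
    """Return True where none of the rejected flag bits are set (hypothetical helper)."""
    reject = sum(1 << bit for bit in REJECT_BITS)
    return (qa & reject) == 0


# Illustrative qa_pixel values: clear land, cloud, water, fill.
qa = np.array([21824, 22280, 21952, 1], dtype=np.uint16)
red = np.array([9000, 36244, 8000, 0], dtype=np.uint16)
nir = np.array([18000, 35466, 7500, 0], dtype=np.uint16)

mask = good_pixel_mask(qa)

# Like xarray's .where(), replace rejected pixels with NaN; this also converts to float.
red_f = np.where(mask, red.astype("float64"), np.nan)
nir_f = np.where(mask, nir.astype("float64"), np.nan)

# NDVI = (NIR - red) / (NIR + red); NaN inputs give NaN outputs.
ndvi = (nir_f - red_f) / (nir_f + red_f)
print(mask)
print(ndvi)
```

Only the first pixel survives the mask, so only it gets a numeric NDVI. The conversion to float before subtracting matters: in uint16 arithmetic a pixel with nir smaller than red would wrap around instead of going negative.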

The result ndvi is an xarray.DataArray. Let's take a look at it. Again the notebook will render an HTML version of the data in summary form. Notice that the actual data values are shown again, that there are the same number of time slices as above, and that the x and y dimensions are identical.

In [8]:
ndvi
Out[8]:
<xarray.DataArray (time: 6, y: 381, x: 335)>
array([[[       nan,        nan,        nan, ...,        nan,
                nan,        nan],
        [       nan,        nan,        nan, ...,        nan,
                nan,        nan],
        [       nan,        nan,        nan, ...,        nan,
                nan,        nan],
        ...,
        [       nan,        nan,        nan, ...,        nan,
                nan,        nan],
        [       nan,        nan,        nan, ...,        nan,
                nan,        nan],
        [       nan,        nan,        nan, ...,        nan,
                nan,        nan]],

       [[       nan,        nan,        nan, ...,        nan,
                nan,        nan],
        [       nan,        nan,        nan, ...,        nan,
                nan,        nan],
        [       nan,        nan,        nan, ...,        nan,
                nan,        nan],
...
        [       nan,        nan,        nan, ...,        nan,
                nan,        nan],
        [       nan,        nan,        nan, ...,        nan,
                nan,        nan],
        [       nan,        nan,        nan, ...,        nan,
                nan,        nan]],

       [[       nan,        nan,        nan, ..., 0.06816315,
         0.07301709, 0.07054072],
        [       nan,        nan,        nan, ..., 0.06340865,
         0.06887292, 0.06703169],
        [       nan,        nan,        nan, ..., 0.06328423,
         0.06745553, 0.0685059 ],
        ...,
        [0.06371912, 0.09801642, 0.08978288, ..., 0.04389915,
         0.0333082 , 0.03885442],
        [0.07099895, 0.07649794, 0.06243465, ..., 0.04756806,
         0.03867994, 0.03976882],
        [0.04632761, 0.04720268, 0.04689552, ..., 0.05914572,
         0.0622024 , 0.05805989]]])
Coordinates:
  * time         (time) datetime64[ns] 2022-02-07T14:39:10.740819 ... 2022-04...
  * y            (y) float64 6.692e+06 6.692e+06 ... 6.681e+06 6.681e+06
  * x            (x) float64 8.572e+05 8.572e+05 ... 8.672e+05 8.672e+05
    spatial_ref  int32 32718

Raw numbers aren't nice to look at, so let's draw a time slice. We'll select just one of them to draw and pick one that didn't get masked out by cloud completely. You can see that all clouds and water have been masked out so that we are just looking at the NDVI of the land area.

In [9]:
ndvi.isel(time=1).plot()
Out[9]:
<matplotlib.collections.QuadMesh at 0x7f49f481afe0>
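
Rather than guessing which time slice survived the mask, you can count the non-NaN pixels in each slice. The sketch below does this with numpy on a small synthetic array whose values are invented for illustration. On the real data, the equivalent xarray expression would be ndvi.notnull().sum(dim=['y', 'x']), and the resulting index can be passed to ndvi.isel(time=...).

```python
import numpy as np

# Synthetic stand-in for ndvi with shape (time, y, x); NaN marks masked pixels.
ndvi_demo = np.full((3, 2, 2), np.nan)
ndvi_demo[1, 0, 0] = 0.2                      # slice 1: one valid pixel
ndvi_demo[2] = [[0.1, 0.3], [0.5, np.nan]]    # slice 2: three valid pixels

# Count valid (non-NaN) pixels in each time slice.
valid_per_slice = np.isfinite(ndvi_demo).sum(axis=(1, 2))

# Index of the slice with the most valid pixels.
best = int(valid_per_slice.argmax())
print(valid_per_slice, best)
```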

Exploring Dask with the ODC - Concepts

Let's set our time range to a couple of weeks, which covers one or two passes of Landsat 8 for this ROI. Less data will allow us to explore how dask works with the datacube and xarray libraries.
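
As a quick sanity check on how many passes to expect, the sketch below recomputes the gaps between the six acquisition times listed in the earlier output (copied here with fractional seconds dropped). It uses only the standard library. The gaps come out at 16 days, which matches Landsat 8's 16-day repeat cycle, so a two-week window should typically hold one acquisition from a given path, with a second possible only where adjacent paths overlap the ROI.

```python
from datetime import datetime

# Acquisition times copied from the earlier output (fractional seconds dropped).
times = [
    "2022-02-07 14:39:10", "2022-02-23 14:39:05", "2022-03-11 14:39:01",
    "2022-03-27 14:38:49", "2022-04-12 14:38:53", "2022-04-28 14:38:48",
]
stamps = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in times]

# Gap between consecutive acquisitions, rounded to the nearest whole day.
gaps_days = [
    round((later - earlier).total_seconds() / 86400)
    for earlier, later in zip(stamps, stamps[1:])
]
print(gaps_days)
```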

In [10]:
set_time = ("2021-01-01", "2021-01-14")
In [11]:
%%time
dataset = None # clear results from any previous runs
dataset = dc.load(
            product=product,
            x=study_area_lon,
            y=study_area_lat,
            time=set_time,
            measurements=measurements,
            resampling={"qa_pixel": "nearest", "*": "average"},
            output_crs=set_crs,
            resolution=set_resolution,
            group_by=group_by,
        )
dataset
CPU times: user 359 ms, sys: 769 ms, total: 1.13 s
Wall time: 4.36 s
Out[11]:
<xarray.Dataset>
Dimensions:      (time: 1, y: 381, x: 335)
Coordinates:
  * time         (time) datetime64[ns] 2021-01-03T14:39:19.317361
  * y            (y) float64 6.692e+06 6.692e+06 ... 6.681e+06 6.681e+06
  * x            (x) float64 8.572e+05 8.572e+05 ... 8.672e+05 8.672e+05
    spatial_ref  int32 32718
Data variables:
    coastal      (time, y, x) uint16 38342 38311 37994 ... 43075 43200 43245
    blue         (time, y, x) uint16 38117 38116 37811 ... 43014 43109 43154
    green        (time, y, x) uint16 36410 36469 36214 ... 41419 41545 41591
    red          (time, y, x) uint16 36244 36332 36158 ... 41362 41495 41545
    nir08        (time, y, x) uint16 35466 35587 35444 ... 40546 40676 40735
    swir16       (time, y, x) uint16 28398 28591 28442 ... 28314 28429 28429
    swir22       (time, y, x) uint16 22729 22865 22740 ... 21030 21133 21130
    qa_pixel     (time, y, x) uint16 22280 22280 22280 ... 22280 22280 22280
    qa_aerosol   (time, y, x) uint8 224 206 220 224 210 ... 224 224 224 224 224
    qa_radsat    (time, y, x) uint16 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
Attributes:
    crs:           epsg:32718
    grid_mapping:  spatial_ref

As before, you can see the actual data in the results, but this time there should be only 1 or 2 observation times.
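
The qa_pixel values visible in the summary above are all 22280. Decoding that number against the qa_pixel flags_definition suggests why a single value repeats: it has the cloud bit set with high cloud confidence. The sketch below does the decoding in plain Python. The flag names and bit positions come from the product's flags_definition, while the helper decode_qa is hypothetical. Only the corner values are printed in the summary, so this says nothing certain about the elided pixels.

```python
# Single-bit flags and two-bit confidence fields from the qa_pixel flags_definition.
SINGLE_BITS = {
    "nodata": 0, "dilated_cloud": 1, "cirrus": 2, "cloud": 3,
    "cloud_shadow": 4, "snow": 5, "clear": 6, "water": 7,
}
CONFIDENCE_LOW_BITS = {
    "cloud_confidence": 8, "cloud_shadow_confidence": 10,
    "snow_ice_confidence": 12, "cirrus_confidence": 14,
}


def decode_qa(value):
    """Split a qa_pixel integer into its flags (hypothetical helper)."""
    flags = {name: bool((value >> bit) & 1) for name, bit in SINGLE_BITS.items()}
    # Each confidence field spans two bits, starting at its low bit.
    for name, low_bit in CONFIDENCE_LOW_BITS.items():
        flags[name] = (value >> low_bit) & 0b11
    return flags


decoded = decode_qa(22280)
print(decoded)
```

A cloud_confidence of 3 is labelled 'high' in the flags_definition, and the cloud bit itself is set, so pixels with this value would be rejected by the good_pixel_flags mask used earlier.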

Now let's create a LocalCluster as we did in the earlier notebook.

In [12]:
from dask.distributed import Client, LocalCluster

cluster = LocalCluster()
client = Client(cluster)
client
Out[12]:

Client

Client-c74f3414-fa49-11ed-99da-1eb1b782f397

Connection method: Cluster object Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status

Cluster Info

LocalCluster

d46c9864

Dashboard: http://127.0.0.1:8787/status Workers: 4
Total threads: 8 Total memory: 29.00 GiB
Status: running Using processes: True

Scheduler Info

Scheduler

Scheduler-04088dc5-463b-4f93-8651-13df22de11c5

Comm: tcp://127.0.0.1:36203 Workers: 4
Dashboard: http://127.0.0.1:8787/status Total threads: 8
Started: Just now Total memory: 29.00 GiB

Workers

Worker: 0

Comm: tcp://127.0.0.1:46441 Total threads: 2
Dashboard: http://127.0.0.1:46687/status Memory: 7.25 GiB
Nanny: tcp://127.0.0.1:46815
Local directory: /tmp/dask-worker-space/worker-oe2jypeh

Worker: 1

Comm: tcp://127.0.0.1:41515 Total threads: 2
Dashboard: http://127.0.0.1:39339/status Memory: 7.25 GiB
Nanny: tcp://127.0.0.1:38017
Local directory: /tmp/dask-worker-space/worker-qs4kvx5w

Worker: 2

Comm: tcp://127.0.0.1:38099 Total threads: 2
Dashboard: http://127.0.0.1:41629/status Memory: 7.25 GiB
Nanny: tcp://127.0.0.1:41837
Local directory: /tmp/dask-worker-space/worker-gp8eu0fp

Worker: 3

Comm: tcp://127.0.0.1:43711 Total threads: 2
Dashboard: http://127.0.0.1:42069/status Memory: 7.25 GiB
Nanny: tcp://127.0.0.1:36811
Local directory: /tmp/dask-worker-space/worker-bsblm3vv

You may like to open up the dashboard for the cluster, although for this notebook we won't be talking about the dashboard (that's for a later discussion).

In [13]:
notebook_utils.localcluster_dashboard(client=client,server=easi.hub)
Out[13]:
'https://hub.datacubechile.cl/user/jhodge/proxy/8787/status'

Now that we are using a cluster, even though it is local, we need to make sure that the cluster has the right configuration to use Requester Pays buckets in AWS S3. To do this, we re-run the configure_s3_access() function that we ran earlier, this time passing the client to the function as well.

In [14]:
from datacube.utils.aws import configure_s3_access
configure_s3_access(aws_unsigned=False, requester_pays=True, client=client);

datacube.load() will return dask-backed (lazy) arrays if the dask_chunks parameter is specified, and the work of loading them will run on the default dask cluster (the one we just created).

The chunk shape and memory size are critical parameters in tuning dask, and we will be discussing them in great detail as scale increases. For now we're simply going to specify that the time dimension should be individually chunked (1 slice of time). By not specifying any chunking for the other dimensions, they will each form a single contiguous block.

If that made no sense whatsoever, that's fine because we will look at an example.
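
As a warm-up, here is a small sketch of that chunking using dask.array directly rather than datacube, so that it runs without any data access. The array is synthetic (all ones) and the choice of two time slices is arbitrary. In dask, -1 means "do not split this dimension", which is assumed here to mirror what leaving y and x out of dask_chunks does.

```python
import dask.array as da

# Synthetic (time, y, x) array: one chunk per time slice, y and x left whole.
arr = da.ones((2, 381, 335), chunks=(1, -1, -1), dtype="uint16")

print(arr.chunksize)   # shape of the largest chunk
print(arr.numblocks)   # number of chunks along each dimension

# Nothing has been computed yet: `total` is only a task graph until .compute() is called.
total = arr.sum()
result = int(total.compute())
print(result)
```

Building arr and total is nearly instant because no values exist until compute() runs, which is the same reason a dask-backed load returns quickly.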

In [15]:
chunks = {"time":1}
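
With chunks = {"time": 1}, each chunk holds one full time slice of one band. As a rough worked example, assuming the 381 x 335 pixel grid and uint16 bands shown in the earlier load output, the memory per chunk is:

```python
# Grid size and dtype width taken from the earlier (non-dask) load output.
time_per_chunk, ny, nx = 1, 381, 335
bytes_per_value = 2  # uint16

chunk_bytes = time_per_chunk * ny * nx * bytes_per_value
chunk_kib = chunk_bytes / 1024

print(f"{chunk_bytes} bytes = {chunk_kib:.2f} KiB per band per time slice")
```

That is a very small chunk, which is fine for exploring concepts on a paddock-sized ROI.
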
In [16]:
%%time
dataset = None # clear results from any previous runs
dataset = dc.load(
            product=product,
            x=study_area_lon,
            y=study_area_lat,
            time=set_time,
            measurements=measurements,
            resampling={"qa_pixel": "nearest", "*": "average"},
            output_crs=set_crs,
            resolution=set_resolution,
            dask_chunks = chunks, ###### THIS IS THE ONLY LINE CHANGED. #####
            group_by=group_by,
        )
dataset
CPU times: user 28.3 ms, sys: 99 µs, total: 28.4 ms
Wall time: 35.5 ms
Out[16]:
<xarray.Dataset>
Dimensions:      (time: 1, y: 381, x: 335)
Coordinates:
  * time         (time) datetime64[ns] 2021-01-03T14:39:19.317361
  * y            (y) float64 6.692e+06 6.692e+06 ... 6.681e+06 6.681e+06
  * x            (x) float64 8.572e+05 8.572e+05 ... 8.672e+05 8.672e+05
    spatial_ref  int32 32718
Data variables:
    coastal      (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    blue         (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    green        (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    red          (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    nir08        (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    swir16       (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    swir22       (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    qa_pixel     (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    qa_aerosol   (time, y, x) uint8 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    qa_radsat    (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
Attributes:
    crs:           epsg:32718
    grid_mapping:  spatial_ref
xarray.Dataset
    • time: 1
    • y: 381
    • x: 335
    • time
      (time)
      datetime64[ns]
      2021-01-03T14:39:19.317361
      units :
      seconds since 1970-01-01 00:00:00
      array(['2021-01-03T14:39:19.317361000'], dtype='datetime64[ns]')
    • y
      (y)
      float64
      6.692e+06 6.692e+06 ... 6.681e+06
      units :
      metre
      resolution :
      -30.0
      crs :
      epsg:32718
      array([6692085., 6692055., 6692025., ..., 6680745., 6680715., 6680685.])
    • x
      (x)
      float64
      8.572e+05 8.572e+05 ... 8.672e+05
      units :
      metre
      resolution :
      30.0
      crs :
      epsg:32718
      array([857175., 857205., 857235., ..., 867135., 867165., 867195.])
    • spatial_ref
      ()
      int32
      32718
      spatial_ref :
      PROJCS["WGS 84 / UTM zone 18S",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-75],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32718"]]
      grid_mapping_name :
      transverse_mercator
      array(32718, dtype=int32)
    • coastal
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 249.29 kiB 249.29 kiB
      Shape (1, 381, 335) (1, 381, 335)
      Dask graph 1 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 1
    • blue
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 249.29 kiB 249.29 kiB
      Shape (1, 381, 335) (1, 381, 335)
      Dask graph 1 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 1
    • green
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 249.29 kiB 249.29 kiB
      Shape (1, 381, 335) (1, 381, 335)
      Dask graph 1 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 1
    • red
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 249.29 kiB 249.29 kiB
      Shape (1, 381, 335) (1, 381, 335)
      Dask graph 1 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 1
    • nir08
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 249.29 kiB 249.29 kiB
      Shape (1, 381, 335) (1, 381, 335)
      Dask graph 1 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 1
    • swir16
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 249.29 kiB 249.29 kiB
      Shape (1, 381, 335) (1, 381, 335)
      Dask graph 1 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 1
    • swir22
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 249.29 kiB 249.29 kiB
      Shape (1, 381, 335) (1, 381, 335)
      Dask graph 1 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 1
    • qa_pixel
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      bit_index
      nodata :
      1
      flags_definition :
      {'snow': {'bits': 5, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'clear': {'bits': 6, 'values': {'0': 'not_clear', '1': 'clear'}}, 'cloud': {'bits': 3, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'water': {'bits': 7, 'values': {'0': 'land_or_cloud', '1': 'water'}}, 'cirrus': {'bits': 2, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'nodata': {'bits': 0, 'values': {'0': False, '1': True}}, 'qa_pixel': {'bits': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], 'values': {'1': 'Fill', '2': 'Dilated Cloud', '4': 'Cirrus', '8': 'Cloud', '16': 'Cloud Shadow', '32': 'Snow', '64': 'Clear', '128': 'Water', '256': 'Cloud Confidence low bit', '512': 'Cloud Confidence high bit', '1024': 'Cloud Shadow Confidence low bit', '2048': 'Cloud Shadow Confidence high bit', '4096': 'Snow Ice Confidence low bit', '8192': 'Snow Ice Confidence high bit', '16384': 'Cirrus Confidence low bit', '32768': 'Cirrus Confidence high bit'}, 'description': 'Level 2 pixel quality'}, 'cloud_shadow': {'bits': 4, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'dilated_cloud': {'bits': 1, 'values': {'0': 'not_dilated', '1': 'dilated'}}, 'cloud_confidence': {'bits': [8, 9], 'values': {'0': 'none', '1': 'low', '2': 'medium', '3': 'high'}}, 'cirrus_confidence': {'bits': [14, 15], 'values': {'0': 'none', '1': 'low', '2': 'reserved', '3': 'high'}}, 'snow_ice_confidence': {'bits': [12, 13], 'values': {'0': 'none', '1': 'low', '2': 'reserved', '3': 'high'}}, 'cloud_shadow_confidence': {'bits': [10, 11], 'values': {'0': 'none', '1': 'low', '2': 'reserved', '3': 'high'}}}
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 249.29 kiB 249.29 kiB
      Shape (1, 381, 335) (1, 381, 335)
      Dask graph 1 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 1
    • qa_aerosol
      (time, y, x)
      uint8
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      bit_index
      nodata :
      1
      flags_definition :
      {'water': {'bits': 2, 'values': {'0': 'not_water', '1': 'water'}}, 'nodata': {'bits': 0, 'values': {'0': False, '1': True}}, 'qa_aerosol': {'bits': [0, 1, 2, 3, 4, 5, 6, 7], 'values': {'1': 'Fill', '2': 'Valid aerosol retrieval', '4': 'Water', '8': 'Unused', '16': 'Unused', '32': 'Interpolated Aerosol', '64': 'Aerosol Level low bit', '128': 'Aerosol Level high bit'}, 'description': 'Aerosol quality assessment'}, 'aerosol_level': {'bits': [6, 7], 'values': {'0': 'climatology', '1': 'low', '2': 'medium', '3': 'high'}}, 'valid_retrieval': {'bits': 1, 'values': {'0': 'not_valid', '1': 'valid'}}, 'interp_retrieval': {'bits': 5, 'values': {'0': 'not_aerosol_interpolated', '1': 'aerosol_interpolated'}}}
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 124.64 kiB 124.64 kiB
      Shape (1, 381, 335) (1, 381, 335)
      Dask graph 1 chunks in 1 graph layer
      Data type uint8 numpy.ndarray
      335 381 1
    • qa_radsat
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      bit_index
      nodata :
      1
      flags_definition :
      {'qa_radsat': {'bits': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 'values': {'1': 'Band 1 Data Saturation', '2': 'Band 2 Data Saturation', '4': 'Band 3 Data Saturation', '8': 'Band 4 Data Saturation', '16': 'Band 5 Data Saturation', '32': 'Band 6 Data Saturation', '64': 'Band 7 Data Saturation', '128': 'Unused', '256': 'Band 9 Data Saturation', '512': 'Unused', '1024': 'Unused', '2048': 'Terrain occlusion'}, 'description': 'Radiometric saturation'}, 'b1_saturation': {'bits': 0, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b2_saturation': {'bits': 1, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b3_saturation': {'bits': 2, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b4_saturation': {'bits': 3, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b5_saturation': {'bits': 4, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b6_saturation': {'bits': 5, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b7_saturation': {'bits': 6, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b9_saturation': {'bits': 8, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'terrain_occlusion': {'bits': 11, 'values': {'0': 'no_terrain_occlusion', '1': 'terrain_occlusion'}}}
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 249.29 kiB 249.29 kiB
      Shape (1, 381, 335) (1, 381, 335)
      Dask graph 1 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 1
    • time
      PandasIndex
      PandasIndex(DatetimeIndex(['2021-01-03 14:39:19.317361'], dtype='datetime64[ns]', name='time', freq=None))
    • y
      PandasIndex
      PandasIndex(Float64Index([6692085.0, 6692055.0, 6692025.0, 6691995.0, 6691965.0, 6691935.0,
                    6691905.0, 6691875.0, 6691845.0, 6691815.0,
                    ...
                    6680955.0, 6680925.0, 6680895.0, 6680865.0, 6680835.0, 6680805.0,
                    6680775.0, 6680745.0, 6680715.0, 6680685.0],
                   dtype='float64', name='y', length=381))
    • x
      PandasIndex
      PandasIndex(Float64Index([857175.0, 857205.0, 857235.0, 857265.0, 857295.0, 857325.0,
                    857355.0, 857385.0, 857415.0, 857445.0,
                    ...
                    866925.0, 866955.0, 866985.0, 867015.0, 867045.0, 867075.0,
                    867105.0, 867135.0, 867165.0, 867195.0],
                   dtype='float64', name='x', length=335))
  • crs :
    epsg:32718
    grid_mapping :
    spatial_ref

The first thing you probably noticed is that, although only one line changed, the load time dropped to under a second! The second is that if you look at one of the data variables by clicking on the database icon as before, there is no data; instead there is a diagram showing the Dask chunks for each measurement. The load is fast because it didn't actually load any data!

When dask_chunks is specified, datacube still returns an xarray.Dataset, but each data variable is backed by a dask.array instead of an in-memory numpy array and is loaded lazily: no data is read until it is used. If you look at one of the data variables you will see it now shows dask.array<chunksize=(....)> rather than values, and the cylinder icon shows the Array and Chunk parameters along with some statistics, not actual data.
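
To make the lazy behaviour concrete, here is a minimal sketch that does not use datacube at all. It assumes only that numpy and dask are installed; fake_read is a hypothetical stand-in for reading one time slice from storage, and the slice shape is borrowed from the output above. It builds a two-slice array chunked along time and records when the "reads" actually happen. The synchronous scheduler is requested explicitly so that the call list is updated in this process.

```python
import numpy as np
import dask
import dask.array as da

calls = []  # time indices whose data was actually "read"

def fake_read(t):
    """Hypothetical stand-in for reading one time slice from storage."""
    calls.append(t)
    return np.full((1, 381, 335), t, dtype=np.uint16)

# One lazy chunk per time slice, similar in spirit to dask_chunks={"time": 1}
slices = [
    da.from_delayed(dask.delayed(fake_read)(t), shape=(1, 381, 335), dtype=np.uint16)
    for t in range(2)
]
lazy = da.concatenate(slices, axis=0)

reads_before = len(calls)        # building the array has not read anything
print(lazy.chunks, reads_before)

# Only compute() triggers the reads and returns a numpy array
result = lazy.compute(scheduler="synchronous")
print(type(result).__name__, result.shape, sorted(calls))
```

Until compute() is called, lazy holds only the shape, dtype and chunk layout, which is why the load cell above returns almost immediately.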

datacube.load() has used the dask.Delayed interface, which will not perform any tasks (Dask's name for calculations) until the result of the task is actually required. We'll load the data in a moment, but first let's take a look at the parameters in that pretty visualisation. Click on the cylinder for the red data variable and look at the table and the figure. It should look similar to the image below.

Looking at this image (yours may be different), you can see that:

  1. The Array is 221.92 kiB in total size and is broken into Chunks which have size 110.96 kiB
  2. The Array shape is (2, 375, 303) (time, y, x) but each chunk is (1,375,303) because we specified the time dimension should have chunks of length 1.
  3. There are 2 chunk tasks, one for each time slice and in this instance, only one graph layer. More complex calculations will have more layers in the graph.
  4. The Array type is uint16 and is split up into chunks which are numpy.ndarrays.

The chunking has split the array loading into two Chunks. Dask can execute these in parallel.
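
The Bytes and chunk-count figures in that table can be reproduced by hand. The sketch below uses only the Python standard library; the helper names size_kib and n_chunks are made up for illustration. It checks the values shown in this run's output for a (1, 381, 335) chunk, and the chunk count for the (2, 375, 303) example in the list above.

```python
from math import ceil, prod

ITEMSIZE = {"uint8": 1, "uint16": 2}  # bytes per element

def size_kib(shape, dtype):
    """Size of an array (or chunk) in kiB: number of elements x bytes per element."""
    return prod(shape) * ITEMSIZE[dtype] / 1024

def n_chunks(shape, chunk_shape):
    """Number of chunks needed to tile an array of the given shape."""
    return prod(ceil(s / c) for s, c in zip(shape, chunk_shape))

chunk = (1, 381, 335)
print(f"{size_kib(chunk, 'uint16'):.2f} kiB")  # 249.29 kiB, as shown for the uint16 bands
print(f"{size_kib(chunk, 'uint8'):.2f} kiB")   # 124.64 kiB, as shown for qa_aerosol

# Two time slices chunked one slice at a time give two chunks
print(n_chunks((2, 375, 303), (1, 375, 303)))  # 2
```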

We can look at the delayed tasks and how they will be executed by visualising the task graph for one of the variables. We'll use the red band measurement.

In [17]:
dataset.red.data.visualize()
Out[17]:

Details on the task graph can be found in the Dask user guide, but what's clear is that you have two independent paths of execution, each producing one time slice: (0,0,0) and (1,0,0). These are the two chunks that the full array has been split into.

To retrieve the actual data we need to compute() the result; this causes all the delayed tasks for the variable we are computing to be executed. Let's compute() the red variable.

In [18]:
%%time
actual_red = dataset.red.compute()
actual_red
CPU times: user 140 ms, sys: 37.8 ms, total: 178 ms
Wall time: 1.38 s
Out[18]:
<xarray.DataArray 'red' (time: 1, y: 381, x: 335)>
array([[[36244, 36332, 36158, ..., 42967, 42709, 42677],
        [35998, 35979, 35849, ..., 42915, 42607, 42484],
        [35682, 35681, 35549, ..., 42836, 42604, 42447],
        ...,
        [37428, 37620, 37992, ..., 41429, 41547, 41576],
        [37429, 37471, 37771, ..., 41378, 41484, 41530],
        [37592, 37433, 37576, ..., 41362, 41495, 41545]]], dtype=uint16)
Coordinates:
  * time         (time) datetime64[ns] 2021-01-03T14:39:19.317361
  * y            (y) float64 6.692e+06 6.692e+06 ... 6.681e+06 6.681e+06
  * x            (x) float64 8.572e+05 8.572e+05 ... 8.672e+05 8.672e+05
    spatial_ref  int32 32718
Attributes:
    units:         reflectance
    nodata:        0
    crs:           epsg:32718
    grid_mapping:  spatial_ref

As you can see we now have actual data (there are real numbers, not just Dask arrays). You can do the same thing for all arrays in the dataset in one go by computing the dataset itself.

In [19]:
%%time
actual_dataset = dataset.compute()
actual_dataset
CPU times: user 81.7 ms, sys: 614 µs, total: 82.3 ms
Wall time: 1.33 s
Out[19]:
<xarray.Dataset>
Dimensions:      (time: 1, y: 381, x: 335)
Coordinates:
  * time         (time) datetime64[ns] 2021-01-03T14:39:19.317361
  * y            (y) float64 6.692e+06 6.692e+06 ... 6.681e+06 6.681e+06
  * x            (x) float64 8.572e+05 8.572e+05 ... 8.672e+05 8.672e+05
    spatial_ref  int32 32718
Data variables:
    coastal      (time, y, x) uint16 38342 38311 37994 ... 43075 43200 43245
    blue         (time, y, x) uint16 38117 38116 37811 ... 43014 43109 43154
    green        (time, y, x) uint16 36410 36469 36214 ... 41419 41545 41591
    red          (time, y, x) uint16 36244 36332 36158 ... 41362 41495 41545
    nir08        (time, y, x) uint16 35466 35587 35444 ... 40546 40676 40735
    swir16       (time, y, x) uint16 28398 28591 28442 ... 28314 28429 28429
    swir22       (time, y, x) uint16 22729 22865 22740 ... 21030 21133 21130
    qa_pixel     (time, y, x) uint16 22280 22280 22280 ... 22280 22280 22280
    qa_aerosol   (time, y, x) uint8 224 206 220 224 210 ... 224 224 224 224 224
    qa_radsat    (time, y, x) uint16 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
Attributes:
    crs:           epsg:32718
    grid_mapping:  spatial_ref

The impact of dask on ODC

From the above we can see that specifying dask_chunks in datacube.load() splits the load() operation into a set of chunk-shaped arrays and delayed tasks. Dask can now perform those tasks in parallel. Dask will only compute the results for those parts of the data we are using, but we can force the computation of all the delayed tasks using compute().
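
As a rough illustration of "only compute what is used", the sketch below again avoids datacube and assumes only numpy and dask. fake_read is a hypothetical stand-in for reading one time slice, and four slices are used purely for illustration. Selecting a single time slice before calling compute() runs only the task for the chunk that holds it, whereas computing the whole array runs them all. The synchronous scheduler is requested explicitly so that the list of reads is updated in this process.

```python
import numpy as np
import dask
import dask.array as da

reads = []  # time indices whose data was actually "read"

def fake_read(t):
    """Hypothetical stand-in for reading one time slice from storage."""
    reads.append(t)
    return np.full((1, 381, 335), t, dtype=np.uint16)

# Four time slices, one chunk per slice
stack = da.concatenate(
    [
        da.from_delayed(dask.delayed(fake_read)(t), shape=(1, 381, 335), dtype=np.uint16)
        for t in range(4)
    ],
    axis=0,
)

# Ask for a single time slice: only the chunk that holds it is read
one_slice = stack[2].compute(scheduler="synchronous")
reads_after_slice = list(reads)
print(one_slice.shape, reads_after_slice)

# Ask for everything: every chunk is read
everything = stack.compute(scheduler="synchronous")
print(everything.shape, len(reads))
```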

There is a lot more opportunity than described in this simple example but let's just focus on the impact of dask on ODC for this simple case.

The time period and ROI are far too small to be interesting so let's change our time range to a full year of data.

In [20]:
set_time = ("2021-01-01", "2021-12-31")

First load the data without dask (no dask_chunks specified).

NOTE: this will take several minutes, so be patient.

In [21]:
%%time
dataset = None # clear results from any previous runs
dataset = dc.load(
            product=product,
            x=study_area_lon,
            y=study_area_lat,
            time=set_time,
            measurements=measurements,
            resampling={"qa_pixel": "nearest", "*": "average"},
            output_crs=set_crs,
            resolution=set_resolution,
            group_by=group_by,
        )
dataset
CPU times: user 9.65 s, sys: 17.5 s, total: 27.1 s
Wall time: 1min 26s
Out[21]:
<xarray.Dataset>
Dimensions:      (time: 22, y: 381, x: 335)
Coordinates:
  * time         (time) datetime64[ns] 2021-01-03T14:39:19.317361 ... 2021-12...
  * y            (y) float64 6.692e+06 6.692e+06 ... 6.681e+06 6.681e+06
  * x            (x) float64 8.572e+05 8.572e+05 ... 8.672e+05 8.672e+05
    spatial_ref  int32 32718
Data variables:
    coastal      (time, y, x) uint16 38342 38311 37994 37453 ... 9836 9832 9814
    blue         (time, y, x) uint16 38117 38116 37811 ... 10336 10392 10318
    green        (time, y, x) uint16 36410 36469 36214 ... 11842 11640 11518
    red          (time, y, x) uint16 36244 36332 36158 ... 13181 12926 12585
    nir08        (time, y, x) uint16 35466 35587 35444 ... 14604 14630 14254
    swir16       (time, y, x) uint16 28398 28591 28442 ... 17902 17175 16479
    swir22       (time, y, x) uint16 22729 22865 22740 ... 16694 16075 15613
    qa_pixel     (time, y, x) uint16 22280 22280 22280 ... 21824 21824 21824
    qa_aerosol   (time, y, x) uint8 224 206 220 224 210 ... 166 160 106 96 96
    qa_radsat    (time, y, x) uint16 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
Attributes:
    crs:           epsg:32718
    grid_mapping:  spatial_ref

There should now be a larger number of time observations (22 in the output above; the exact count depends on the area and time range), and the load should take in the order of 3-5 minutes.

Let's enable Dask and repeat the load. We're chunking by time with a chunk length of one, so Dask can load each time slice in parallel. The data variables are also independent of one another, so they will be loaded in parallel as well.
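To make the effect of chunking by time concrete, here is a minimal plain-Python sketch of the arithmetic. It does not call Dask or datacube. The `chunk_layout` helper is hypothetical and written only for this illustration. The shape `(22, 381, 335)` and the `uint16` item size are taken from the dataset reported in this run, and the sketch assumes that dimensions not named in the chunk specification stay as a single chunk.

```python
import math

def chunk_layout(shape, chunks, itemsize):
    """Return (blocks per dimension, total chunk count, bytes per full chunk).

    Hypothetical helper for illustration only; not part of Dask or datacube.
    """
    blocks = tuple(math.ceil(n / c) for n, c in zip(shape, chunks))
    return blocks, math.prod(blocks), math.prod(chunks) * itemsize

# Dimensions reported for this run: (time, y, x) = (22, 381, 335); uint16 is 2 bytes.
shape = (22, 381, 335)
# Assumption: {"time": 1} leaves y and x whole, so each chunk is one full time slice.
chunks = (1, 381, 335)

blocks, n_chunks, chunk_bytes = chunk_layout(shape, chunks, itemsize=2)
print(blocks, n_chunks, f"{chunk_bytes / 1024:.2f} KiB")
```

With these numbers each data variable splits into 22 chunks of roughly 249 KiB, one per time slice, and each chunk is a unit of work that can be loaded independently of the others.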

In [22]:
chunks = {"time": 1}  # one chunk per time slice
In [23]:
%%time
dataset = None  # clear results from any previous runs
dataset = dc.load(
    product=product,
    x=study_area_lon,
    y=study_area_lat,
    time=set_time,
    measurements=measurements,
    resampling={"qa_pixel": "nearest", "*": "average"},
    output_crs=set_crs,
    resolution=set_resolution,
    dask_chunks=chunks,  ###### THIS IS THE ONLY LINE CHANGED. #####
    group_by=group_by,
)
dataset
CPU times: user 66.6 ms, sys: 6.84 ms, total: 73.4 ms
Wall time: 107 ms
Out[23]:
<xarray.Dataset>
Dimensions:      (time: 22, y: 381, x: 335)
Coordinates:
  * time         (time) datetime64[ns] 2021-01-03T14:39:19.317361 ... 2021-12...
  * y            (y) float64 6.692e+06 6.692e+06 ... 6.681e+06 6.681e+06
  * x            (x) float64 8.572e+05 8.572e+05 ... 8.672e+05 8.672e+05
    spatial_ref  int32 32718
Data variables:
    coastal      (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    blue         (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    green        (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    red          (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    nir08        (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    swir16       (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    swir22       (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    qa_pixel     (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    qa_aerosol   (time, y, x) uint8 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
    qa_radsat    (time, y, x) uint16 dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
Attributes:
    crs:           epsg:32718
    grid_mapping:  spatial_ref
xarray.Dataset
    • time: 22
    • y: 381
    • x: 335
    • time
      (time)
      datetime64[ns]
      2021-01-03T14:39:19.317361 ... 2...
      units :
      seconds since 1970-01-01 00:00:00
      array(['2021-01-03T14:39:19.317361000', '2021-01-19T14:39:11.996195000',
             '2021-02-04T14:39:10.642566000', '2021-02-20T14:39:06.231918000',
             '2021-03-08T14:38:58.547361000', '2021-03-24T14:38:51.737798000',
             '2021-04-09T14:38:47.161063000', '2021-04-25T14:38:39.521240000',
             '2021-05-11T14:38:36.106063000', '2021-05-27T14:38:45.945207000',
             '2021-06-12T14:38:52.799593000', '2021-06-28T14:38:56.818329000',
             '2021-07-14T14:38:58.020893000', '2021-07-30T14:39:06.209510000',
             '2021-08-31T14:39:16.967976000', '2021-09-16T14:39:20.745598000',
             '2021-10-02T14:39:25.506402000', '2021-10-18T14:39:29.217991000',
             '2021-11-03T14:39:28.562838000', '2021-11-19T14:39:23.470160000',
             '2021-12-05T14:39:25.260483000', '2021-12-21T14:39:22.530622000'],
            dtype='datetime64[ns]')
    • y
      (y)
      float64
      6.692e+06 6.692e+06 ... 6.681e+06
      units :
      metre
      resolution :
      -30.0
      crs :
      epsg:32718
      array([6692085., 6692055., 6692025., ..., 6680745., 6680715., 6680685.])
    • x
      (x)
      float64
      8.572e+05 8.572e+05 ... 8.672e+05
      units :
      metre
      resolution :
      30.0
      crs :
      epsg:32718
      array([857175., 857205., 857235., ..., 867135., 867165., 867195.])
    • spatial_ref
      ()
      int32
      32718
      spatial_ref :
      PROJCS["WGS 84 / UTM zone 18S",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-75],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32718"]]
      grid_mapping_name :
      transverse_mercator
      array(32718, dtype=int32)
    • coastal
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 5.36 MiB 249.29 kiB
      Shape (22, 381, 335) (1, 381, 335)
      Dask graph 22 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 22
    • blue
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 5.36 MiB 249.29 kiB
      Shape (22, 381, 335) (1, 381, 335)
      Dask graph 22 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 22
    • green
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 5.36 MiB 249.29 kiB
      Shape (22, 381, 335) (1, 381, 335)
      Dask graph 22 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 22
    • red
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 5.36 MiB 249.29 kiB
      Shape (22, 381, 335) (1, 381, 335)
      Dask graph 22 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 22
    • nir08
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 5.36 MiB 249.29 kiB
      Shape (22, 381, 335) (1, 381, 335)
      Dask graph 22 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 22
    • swir16
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 5.36 MiB 249.29 kiB
      Shape (22, 381, 335) (1, 381, 335)
      Dask graph 22 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 22
    • swir22
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 5.36 MiB 249.29 kiB
      Shape (22, 381, 335) (1, 381, 335)
      Dask graph 22 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 22
    • qa_pixel
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      bit_index
      nodata :
      1
      flags_definition :
      {'snow': {'bits': 5, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'clear': {'bits': 6, 'values': {'0': 'not_clear', '1': 'clear'}}, 'cloud': {'bits': 3, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'water': {'bits': 7, 'values': {'0': 'land_or_cloud', '1': 'water'}}, 'cirrus': {'bits': 2, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'nodata': {'bits': 0, 'values': {'0': False, '1': True}}, 'qa_pixel': {'bits': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], 'values': {'1': 'Fill', '2': 'Dilated Cloud', '4': 'Cirrus', '8': 'Cloud', '16': 'Cloud Shadow', '32': 'Snow', '64': 'Clear', '128': 'Water', '256': 'Cloud Confidence low bit', '512': 'Cloud Confidence high bit', '1024': 'Cloud Shadow Confidence low bit', '2048': 'Cloud Shadow Confidence high bit', '4096': 'Snow Ice Confidence low bit', '8192': 'Snow Ice Confidence high bit', '16384': 'Cirrus Confidence low bit', '32768': 'Cirrus Confidence high bit'}, 'description': 'Level 2 pixel quality'}, 'cloud_shadow': {'bits': 4, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'dilated_cloud': {'bits': 1, 'values': {'0': 'not_dilated', '1': 'dilated'}}, 'cloud_confidence': {'bits': [8, 9], 'values': {'0': 'none', '1': 'low', '2': 'medium', '3': 'high'}}, 'cirrus_confidence': {'bits': [14, 15], 'values': {'0': 'none', '1': 'low', '2': 'reserved', '3': 'high'}}, 'snow_ice_confidence': {'bits': [12, 13], 'values': {'0': 'none', '1': 'low', '2': 'reserved', '3': 'high'}}, 'cloud_shadow_confidence': {'bits': [10, 11], 'values': {'0': 'none', '1': 'low', '2': 'reserved', '3': 'high'}}}
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 5.36 MiB 249.29 kiB
      Shape (22, 381, 335) (1, 381, 335)
      Dask graph 22 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 22
    • qa_aerosol
      (time, y, x)
      uint8
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      bit_index
      nodata :
      1
      flags_definition :
      {'water': {'bits': 2, 'values': {'0': 'not_water', '1': 'water'}}, 'nodata': {'bits': 0, 'values': {'0': False, '1': True}}, 'qa_aerosol': {'bits': [0, 1, 2, 3, 4, 5, 6, 7], 'values': {'1': 'Fill', '2': 'Valid aerosol retrieval', '4': 'Water', '8': 'Unused', '16': 'Unused', '32': 'Interpolated Aerosol', '64': 'Aerosol Level low bit', '128': 'Aerosol Level high bit'}, 'description': 'Aerosol quality assessment'}, 'aerosol_level': {'bits': [6, 7], 'values': {'0': 'climatology', '1': 'low', '2': 'medium', '3': 'high'}}, 'valid_retrieval': {'bits': 1, 'values': {'0': 'not_valid', '1': 'valid'}}, 'interp_retrieval': {'bits': 5, 'values': {'0': 'not_aerosol_interpolated', '1': 'aerosol_interpolated'}}}
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 2.68 MiB 124.64 kiB
      Shape (22, 381, 335) (1, 381, 335)
      Dask graph 22 chunks in 1 graph layer
      Data type uint8 numpy.ndarray
      335 381 22
    • qa_radsat
      (time, y, x)
      uint16
      dask.array<chunksize=(1, 381, 335), meta=np.ndarray>
      units :
      bit_index
      nodata :
      1
      flags_definition :
      {'qa_radsat': {'bits': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 'values': {'1': 'Band 1 Data Saturation', '2': 'Band 2 Data Saturation', '4': 'Band 3 Data Saturation', '8': 'Band 4 Data Saturation', '16': 'Band 5 Data Saturation', '32': 'Band 6 Data Saturation', '64': 'Band 7 Data Saturation', '128': 'Unused', '256': 'Band 9 Data Saturation', '512': 'Unused', '1024': 'Unused', '2048': 'Terrain occlusion'}, 'description': 'Radiometric saturation'}, 'b1_saturation': {'bits': 0, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b2_saturation': {'bits': 1, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b3_saturation': {'bits': 2, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b4_saturation': {'bits': 3, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b5_saturation': {'bits': 4, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b6_saturation': {'bits': 5, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b7_saturation': {'bits': 6, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b9_saturation': {'bits': 8, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'terrain_occlusion': {'bits': 11, 'values': {'0': 'no_terrain_occlusion', '1': 'terrain_occlusion'}}}
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      Array Chunk
      Bytes 5.36 MiB 249.29 kiB
      Shape (22, 381, 335) (1, 381, 335)
      Dask graph 22 chunks in 1 graph layer
      Data type uint16 numpy.ndarray
      335 381 22
    • time
      PandasIndex
      PandasIndex(DatetimeIndex(['2021-01-03 14:39:19.317361', '2021-01-19 14:39:11.996195',
                     '2021-02-04 14:39:10.642566', '2021-02-20 14:39:06.231918',
                     '2021-03-08 14:38:58.547361', '2021-03-24 14:38:51.737798',
                     '2021-04-09 14:38:47.161063', '2021-04-25 14:38:39.521240',
                     '2021-05-11 14:38:36.106063', '2021-05-27 14:38:45.945207',
                     '2021-06-12 14:38:52.799593', '2021-06-28 14:38:56.818329',
                     '2021-07-14 14:38:58.020893', '2021-07-30 14:39:06.209510',
                     '2021-08-31 14:39:16.967976', '2021-09-16 14:39:20.745598',
                     '2021-10-02 14:39:25.506402', '2021-10-18 14:39:29.217991',
                     '2021-11-03 14:39:28.562838', '2021-11-19 14:39:23.470160',
                     '2021-12-05 14:39:25.260483', '2021-12-21 14:39:22.530622'],
                    dtype='datetime64[ns]', name='time', freq=None))
    • y
      PandasIndex
      PandasIndex(Float64Index([6692085.0, 6692055.0, 6692025.0, 6691995.0, 6691965.0, 6691935.0,
                    6691905.0, 6691875.0, 6691845.0, 6691815.0,
                    ...
                    6680955.0, 6680925.0, 6680895.0, 6680865.0, 6680835.0, 6680805.0,
                    6680775.0, 6680745.0, 6680715.0, 6680685.0],
                   dtype='float64', name='y', length=381))
    • x
      PandasIndex
      PandasIndex(Float64Index([857175.0, 857205.0, 857235.0, 857265.0, 857295.0, 857325.0,
                    857355.0, 857385.0, 857415.0, 857445.0,
                    ...
                    866925.0, 866955.0, 866985.0, 867015.0, 867045.0, 867075.0,
                    867105.0, 867135.0, 867165.0, 867195.0],
                   dtype='float64', name='x', length=335))
  • crs :
    epsg:32718
    grid_mapping :
    spatial_ref

Whoa, that was fast! But nothing has actually been computed yet: no data has been loaded and all tasks are still pending. Open up the Data Variables, click the stacked cylinders, and take a look at the delayed task counts. These exist for every variable.
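
To make the lazy behaviour concrete, here is a minimal sketch that does not touch the datacube at all. It builds a synthetic Dask array (`band` is a hypothetical stand-in, filled with zeros) with the same shape, dtype and chunking as one band in the repr above, and checks the chunk count and sizes before anything is computed.

```python
import dask.array as da
import numpy as np

# Hypothetical stand-in for one band: same shape, dtype and chunking as the repr above.
band = da.zeros((22, 381, 335), chunks=(1, 381, 335), dtype=np.uint16)

# Creating the array only builds a task graph, so these properties are available
# without loading or allocating the full array.
n_chunks = band.npartitions                                   # number of chunks
chunk_kib = float(np.prod(band.chunksize)) * band.dtype.itemsize / 1024
total_mib = band.nbytes / 1024**2

print(type(band).__name__, n_chunks, round(chunk_kib, 2), round(total_mib, 2))

# Only compute() executes the pending tasks and returns a NumPy array.
loaded = band.compute()
print(type(loaded).__name__, loaded.shape)
```

The chunk and total sizes printed here should line up with the "Bytes" row shown for each uint16 band in the repr.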

Let's visualise the task graph for the red band.

In [24]:
dataset.red.data.visualize()
Out[24]:

Well that's not as useful, is it!

You should just be able to make out that each chunk can load() independently. The time chunk has length 1, so each chunk is a single time step. This holds true for all the bands, so Dask can spread the loads across multiple threads.
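
The one-chunk-per-time-step layout can be checked directly. This is a sketch on a synthetic array (`band` is again a hypothetical stand-in, not the loaded dataset), showing the block layout and computing a single time step without touching the others.

```python
import dask.array as da
import numpy as np

# Hypothetical stand-in for one band, chunked one time step per chunk.
band = da.ones((22, 381, 335), chunks=(1, 381, 335), dtype=np.uint16)

print(band.numblocks)   # number of blocks along (time, y, x)
print(band.chunks[0])   # chunk lengths along the time axis

# Select the first time-step block and compute only that block.
first_step = band.blocks[0]
first_values = first_step.compute()
print(first_values.shape)
```

Because no block depends on any other, a scheduler is free to run them in any order or in parallel.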

Tip: Visualising task graphs is less effective as your task graph complexity increases. You may need to use simpler examples to see what is going on.
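
When a rendered graph is too dense to read, one alternative is to inspect the graph as data. The sketch below uses a deliberately tiny, made-up array (`small`) rather than the notebook's dataset, and lists the graph layers and task count instead of drawing them.

```python
import dask.array as da

# A deliberately tiny example: 4 chunks rather than 22 per band.
small = da.ones((4, 2, 2), chunks=(1, 2, 2))
total = (small + 1).sum()

# Flatten the graph into a plain dict of tasks so it can be counted.
tasks = dict(total.__dask_graph__())
print(len(tasks), "tasks")

# The high-level graph groups tasks into named layers, one per operation.
for layer_name in total.dask.layers:
    print(layer_name)

result = float(total.compute())
print(result)
```

The exact task and layer counts depend on the Dask version and its graph optimisations, so treat them as indicative only.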

Let's get the actual data.

In [25]:
%%time
actual_dataset = dataset.compute()
actual_dataset
/env/lib/python3.10/site-packages/rasterio/warp.py:344: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
  _reproject(
CPU times: user 470 ms, sys: 162 ms, total: 632 ms
Wall time: 7.69 s
Out[25]:
<xarray.Dataset>
Dimensions:      (time: 22, y: 381, x: 335)
Coordinates:
  * time         (time) datetime64[ns] 2021-01-03T14:39:19.317361 ... 2021-12...
  * y            (y) float64 6.692e+06 6.692e+06 ... 6.681e+06 6.681e+06
  * x            (x) float64 8.572e+05 8.572e+05 ... 8.672e+05 8.672e+05
    spatial_ref  int32 32718
Data variables:
    coastal      (time, y, x) uint16 38342 38311 37994 37453 ... 9836 9832 9814
    blue         (time, y, x) uint16 38117 38116 37811 ... 10336 10392 10318
    green        (time, y, x) uint16 36410 36469 36214 ... 11842 11640 11518
    red          (time, y, x) uint16 36244 36332 36158 ... 13181 12926 12585
    nir08        (time, y, x) uint16 35466 35587 35444 ... 14604 14630 14254
    swir16       (time, y, x) uint16 28398 28591 28442 ... 17902 17175 16479
    swir22       (time, y, x) uint16 22729 22865 22740 ... 16694 16075 15613
    qa_pixel     (time, y, x) uint16 22280 22280 22280 ... 21824 21824 21824
    qa_aerosol   (time, y, x) uint8 224 206 220 224 210 ... 166 160 106 96 96
    qa_radsat    (time, y, x) uint16 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
Attributes:
    crs:           epsg:32718
    grid_mapping:  spatial_ref
xarray.Dataset
    • time: 22
    • y: 381
    • x: 335
    • time
      (time)
      datetime64[ns]
      2021-01-03T14:39:19.317361 ... 2...
      units :
      seconds since 1970-01-01 00:00:00
      array(['2021-01-03T14:39:19.317361000', '2021-01-19T14:39:11.996195000',
             '2021-02-04T14:39:10.642566000', '2021-02-20T14:39:06.231918000',
             '2021-03-08T14:38:58.547361000', '2021-03-24T14:38:51.737798000',
             '2021-04-09T14:38:47.161063000', '2021-04-25T14:38:39.521240000',
             '2021-05-11T14:38:36.106063000', '2021-05-27T14:38:45.945207000',
             '2021-06-12T14:38:52.799593000', '2021-06-28T14:38:56.818329000',
             '2021-07-14T14:38:58.020893000', '2021-07-30T14:39:06.209510000',
             '2021-08-31T14:39:16.967976000', '2021-09-16T14:39:20.745598000',
             '2021-10-02T14:39:25.506402000', '2021-10-18T14:39:29.217991000',
             '2021-11-03T14:39:28.562838000', '2021-11-19T14:39:23.470160000',
             '2021-12-05T14:39:25.260483000', '2021-12-21T14:39:22.530622000'],
            dtype='datetime64[ns]')
    • y
      (y)
      float64
      6.692e+06 6.692e+06 ... 6.681e+06
      units :
      metre
      resolution :
      -30.0
      crs :
      epsg:32718
      array([6692085., 6692055., 6692025., ..., 6680745., 6680715., 6680685.])
    • x
      (x)
      float64
      8.572e+05 8.572e+05 ... 8.672e+05
      units :
      metre
      resolution :
      30.0
      crs :
      epsg:32718
      array([857175., 857205., 857235., ..., 867135., 867165., 867195.])
    • spatial_ref
      ()
      int32
      32718
      spatial_ref :
      PROJCS["WGS 84 / UTM zone 18S",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-75],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32718"]]
      grid_mapping_name :
      transverse_mercator
      array(32718, dtype=int32)
    • coastal
      (time, y, x)
      uint16
      38342 38311 37994 ... 9832 9814
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[38342, 38311, 37994, ..., 44083, 43955, 43934],
              [38034, 37992, 37796, ..., 44049, 43865, 43804],
              [37705, 37644, 37454, ..., 44028, 43837, 43745],
              ...,
              [39221, 39599, 40046, ..., 43182, 43265, 43277],
              [39230, 39480, 39864, ..., 43105, 43200, 43240],
              [39448, 39398, 39575, ..., 43075, 43200, 43245]],
      
             [[27076, 26975, 27310, ..., 25533, 24978, 24973],
              [27157, 27323, 27590, ..., 26099, 26176, 26102],
              [27120, 27594, 27725, ..., 26409, 26891, 27563],
              ...,
              [33571, 33770, 33852, ..., 35482, 35125, 34172],
              [33678, 33867, 33903, ..., 35285, 34891, 33944],
              [33660, 33857, 33919, ..., 34770, 34488, 33868]],
      
             [[22430, 23160, 23876, ...,  7597,  8801,  8342],
              [22561, 22890, 23277, ...,  7435,  9220,  8240],
              [22800, 23003, 23224, ..., 10056, 10611,  7981],
              ...,
      ...
              ...,
              [40494, 40391, 40363, ..., 38437, 38234, 38675],
              [40572, 40452, 40365, ..., 38796, 38274, 38645],
              [40524, 40472, 40389, ..., 38892, 38439, 38663]],
      
             [[27891, 27549, 27452, ..., 27017, 27318, 27313],
              [28239, 28003, 27994, ..., 27182, 27370, 27468],
              [28520, 28420, 28530, ..., 27367, 27477, 27659],
              ...,
              [32798, 32936, 32995, ..., 13854, 13077, 12826],
              [32719, 32817, 32796, ..., 14677, 14220, 15676],
              [32457, 32784, 32884, ..., 16920, 17017, 18179]],
      
             [[ 7442,  7587,  7582, ...,  8433,  8483,  8972],
              [ 7443,  7609,  7562, ...,  8409,  8387,  8739],
              [ 7462,  7632,  7596, ...,  8375,  8239,  8454],
              ...,
              [ 9772, 10100, 10620, ...,  9249,  9296,  9214],
              [10608, 10539, 10712, ..., 10035,  9847,  9545],
              [10435, 10416, 10757, ...,  9836,  9832,  9814]]], dtype=uint16)
    • blue
      (time, y, x)
      uint16
      38117 38116 37811 ... 10392 10318
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[38117, 38116, 37811, ..., 44195, 44015, 43985],
              [37834, 37739, 37606, ..., 44154, 43935, 43845],
              [37501, 37389, 37268, ..., 44109, 43910, 43793],
              ...,
              [39101, 39429, 39853, ..., 43080, 43194, 43213],
              [39131, 39336, 39679, ..., 43039, 43139, 43150],
              [39295, 39238, 39424, ..., 43014, 43109, 43154]],
      
             [[27168, 27101, 27438, ..., 25928, 25379, 25442],
              [27305, 27505, 27710, ..., 26476, 26598, 26593],
              [27300, 27777, 27837, ..., 26754, 27253, 27887],
              ...,
              [33676, 33864, 33973, ..., 35803, 35529, 34526],
              [33757, 33941, 34012, ..., 35601, 35243, 34280],
              [33756, 33950, 34037, ..., 35154, 34816, 34224]],
      
             [[22372, 23106, 23847, ...,  8739,  9908,  9477],
              [22346, 22689, 23262, ...,  8539, 10370,  9571],
              [22770, 22982, 23282, ..., 10638, 11391,  9186],
              ...,
      ...
              ...,
              [40649, 40525, 40472, ..., 38548, 38228, 38640],
              [40698, 40530, 40427, ..., 39013, 38400, 38638],
              [40619, 40552, 40477, ..., 39063, 38525, 38720]],
      
             [[27883, 27516, 27419, ..., 27262, 27540, 27496],
              [28258, 27992, 27933, ..., 27375, 27510, 27662],
              [28512, 28341, 28389, ..., 27546, 27623, 27823],
              ...,
              [32777, 32921, 33019, ..., 14052, 13155, 12833],
              [32716, 32814, 32816, ..., 14774, 14331, 15794],
              [32469, 32766, 32885, ..., 17086, 17268, 18448]],
      
             [[ 7595,  7737,  7723, ...,  9038,  9145,  9752],
              [ 7578,  7745,  7688, ...,  8938,  8900,  9341],
              [ 7598,  7768,  7749, ...,  8826,  8725,  8945],
              ...,
              [10741, 11074, 11619, ...,  9825,  9719,  9561],
              [11657, 11585, 11661, ..., 10663, 10340,  9900],
              [11445, 11325, 11708, ..., 10336, 10392, 10318]]], dtype=uint16)
    • green
      (time, y, x)
      uint16
      36410 36469 36214 ... 11640 11518
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[36410, 36469, 36214, ..., 42603, 42349, 42321],
              [36123, 36094, 35915, ..., 42559, 42260, 42164],
              [35816, 35769, 35624, ..., 42484, 42263, 42119],
              ...,
              [37460, 37698, 38115, ..., 41499, 41619, 41638],
              [37501, 37594, 37906, ..., 41446, 41559, 41580],
              [37688, 37538, 37694, ..., 41419, 41545, 41591]],
      
             [[26880, 26742, 27191, ..., 26317, 25813, 25896],
              [27010, 27152, 27463, ..., 26746, 26854, 26917],
              [26911, 27361, 27542, ..., 27032, 27426, 28019],
              ...,
              [32855, 33040, 33127, ..., 35255, 35011, 34073],
              [32932, 33126, 33171, ..., 35121, 34809, 33779],
              [32954, 33120, 33199, ..., 34639, 34448, 33617]],
      
             [[21965, 22629, 23283, ..., 11160, 12365, 11911],
              [22171, 22579, 22940, ..., 12091, 13429, 12904],
              [22658, 22750, 22980, ..., 13804, 14210, 12850],
              ...,
      ...
              ...,
              [39407, 39236, 39170, ..., 37291, 36989, 37393],
              [39366, 39209, 39091, ..., 37847, 37163, 37435],
              [39254, 39214, 39120, ..., 38019, 37290, 37482]],
      
             [[27241, 26945, 26794, ..., 27167, 27412, 27497],
              [27532, 27349, 27264, ..., 27347, 27502, 27661],
              [27743, 27616, 27676, ..., 27518, 27562, 27820],
              ...,
              [31765, 31918, 32035, ..., 15309, 14685, 14379],
              [31685, 31843, 31894, ..., 15810, 15522, 16681],
              [31427, 31784, 31960, ..., 17692, 18021, 19118]],
      
             [[ 7736,  7941,  7892, ..., 10526, 10736, 11328],
              [ 7671,  7949,  7898, ..., 10363, 10295, 10738],
              [ 7715,  7950,  7940, ..., 10119,  9943, 10234],
              ...,
              [12354, 12793, 13603, ..., 11697, 11258, 10662],
              [13889, 13729, 13698, ..., 12433, 11739, 11010],
              [13353, 13147, 13381, ..., 11842, 11640, 11518]]], dtype=uint16)
    • red
      (time, y, x)
      uint16
      36244 36332 36158 ... 12926 12585
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[36244, 36332, 36158, ..., 42967, 42709, 42677],
              [35998, 35979, 35849, ..., 42915, 42607, 42484],
              [35682, 35681, 35549, ..., 42836, 42604, 42447],
              ...,
              [37428, 37620, 37992, ..., 41429, 41547, 41576],
              [37429, 37471, 37771, ..., 41378, 41484, 41530],
              [37592, 37433, 37576, ..., 41362, 41495, 41545]],
      
             [[27213, 27095, 27458, ..., 27127, 26703, 26761],
              [27234, 27456, 27736, ..., 27544, 27656, 27706],
              [27176, 27612, 27821, ..., 27831, 28215, 28790],
              ...,
              [33025, 33207, 33306, ..., 35669, 35493, 34558],
              [33113, 33302, 33345, ..., 35562, 35296, 34287],
              [33108, 33301, 33399, ..., 35095, 34913, 34089]],
      
             [[22172, 22821, 23567, ..., 12047, 13341, 12884],
              [22471, 22814, 23173, ..., 13060, 14513, 13759],
              [22871, 22966, 23204, ..., 15070, 15551, 13813],
              ...,
      ...
              ...,
              [39517, 39335, 39280, ..., 37443, 37083, 37425],
              [39477, 39303, 39177, ..., 38046, 37320, 37467],
              [39367, 39325, 39212, ..., 38208, 37458, 37509]],
      
             [[27331, 27030, 26943, ..., 27857, 28134, 28228],
              [27621, 27446, 27380, ..., 28047, 28222, 28404],
              [27805, 27707, 27759, ..., 28244, 28278, 28527],
              ...,
              [31877, 32021, 32172, ..., 16694, 16047, 15613],
              [31821, 31970, 32045, ..., 17214, 16910, 17848],
              [31564, 31911, 32107, ..., 18917, 19166, 20199]],
      
             [[ 7419,  7624,  7592, ..., 11435, 11673, 12374],
              [ 7382,  7641,  7612, ..., 11287, 11148, 11619],
              [ 7438,  7670,  7660, ..., 11009, 10734, 10992],
              ...,
              [13190, 13732, 14626, ..., 12881, 12395, 11698],
              [15063, 14820, 14769, ..., 13878, 13039, 12081],
              [14407, 14330, 14429, ..., 13181, 12926, 12585]]], dtype=uint16)
    • nir08
      (time, y, x)
      uint16
      35466 35587 35444 ... 14630 14254
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[35466, 35587, 35444, ..., 42378, 42091, 42058],
              [35230, 35267, 35165, ..., 42307, 41987, 41873],
              [35002, 35009, 34904, ..., 42219, 41975, 41824],
              ...,
              [36732, 36868, 37218, ..., 40664, 40748, 40799],
              [36761, 36768, 37030, ..., 40594, 40696, 40692],
              [36954, 36731, 36816, ..., 40546, 40676, 40735]],
      
             [[27594, 27506, 27772, ..., 28259, 27895, 27930],
              [27587, 27791, 28031, ..., 28625, 28733, 28775],
              [27553, 27958, 28127, ..., 28833, 29154, 29678],
              ...,
              [33236, 33432, 33540, ..., 35916, 35857, 34974],
              [33344, 33525, 33611, ..., 35847, 35671, 34724],
              [33376, 33562, 33693, ..., 35497, 35312, 34544]],
      
             [[22723, 23366, 24027, ..., 13848, 15247, 14713],
              [22982, 23226, 23594, ..., 15433, 16732, 15662],
              [23311, 23433, 23648, ..., 17551, 17903, 16019],
              ...,
      ...
              ...,
              [39312, 39144, 39070, ..., 37289, 36803, 37016],
              [39289, 39109, 38957, ..., 37971, 37137, 37100],
              [39126, 39097, 39011, ..., 38042, 37248, 37216]],
      
             [[27448, 27189, 27100, ..., 28570, 28855, 28957],
              [27715, 27515, 27468, ..., 28771, 28937, 29108],
              [27850, 27741, 27765, ..., 28952, 28982, 29155],
              ...,
              [31830, 31993, 32139, ..., 18652, 17989, 17672],
              [31745, 31956, 32101, ..., 18990, 18687, 19613],
              [31607, 31942, 32134, ..., 20570, 20826, 21736]],
      
             [[ 7411,  7621,  7587, ..., 12709, 13032, 13900],
              [ 7416,  7629,  7596, ..., 12528, 12372, 12974],
              [ 7467,  7665,  7659, ..., 12175, 11944, 12296],
              ...,
              [15364, 16694, 17527, ..., 14461, 13810, 13094],
              [16789, 16667, 16583, ..., 15325, 14520, 13480],
              [15983, 15927, 16180, ..., 14604, 14630, 14254]]], dtype=uint16)
    • swir16
      (time, y, x)
      uint16
      28398 28591 28442 ... 17175 16479
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[28398, 28591, 28442, ..., 30118, 29786, 29930],
              [28186, 28312, 28221, ..., 30032, 29642, 29664],
              [27996, 28111, 27998, ..., 29934, 29623, 29559],
              ...,
              [26532, 26787, 27321, ..., 28380, 28445, 28429],
              [26589, 26657, 27083, ..., 28320, 28394, 28388],
              [26866, 26650, 26779, ..., 28314, 28429, 28429]],
      
             [[25585, 25575, 25853, ..., 27726, 27537, 27653],
              [25560, 25737, 26064, ..., 27880, 28042, 28162],
              [25479, 25798, 26099, ..., 27908, 28165, 28640],
              ...,
              [27130, 27325, 27445, ..., 31175, 31137, 30369],
              [27198, 27411, 27499, ..., 31205, 31115, 30167],
              [27220, 27416, 27560, ..., 30964, 30899, 30002]],
      
             [[21116, 21673, 22074, ..., 16096, 17669, 17086],
              [21353, 21659, 21816, ..., 18212, 19505, 18243],
              [21713, 21765, 21869, ..., 20604, 21004, 18995],
              ...,
      ...
              ...,
              [27195, 27144, 27173, ..., 29233, 28733, 29158],
              [27270, 27144, 27104, ..., 29677, 28658, 29012],
              [27208, 27168, 27130, ..., 29842, 28746, 28930]],
      
             [[25201, 25042, 24912, ..., 27466, 27819, 28041],
              [25306, 25195, 25144, ..., 27753, 27978, 28168],
              [25311, 25255, 25276, ..., 27918, 27932, 28060],
              ...,
              [25740, 25893, 25977, ..., 19832, 19464, 19251],
              [25711, 25906, 25999, ..., 20070, 19858, 20683],
              [25561, 25819, 25995, ..., 21233, 21480, 22485]],
      
             [[ 7684,  7888,  7857, ..., 14840, 15341, 16171],
              [ 7623,  7915,  7891, ..., 14468, 14472, 15190],
              [ 7701,  7934,  7943, ..., 13984, 13812, 14275],
              ...,
              [15317, 15883, 16667, ..., 17774, 17137, 16249],
              [16749, 16633, 16853, ..., 18817, 17565, 16250],
              [16369, 16634, 17149, ..., 17902, 17175, 16479]]], dtype=uint16)
    • swir22
      (time, y, x)
      uint16
      22729 22865 22740 ... 16075 15613
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[22729, 22865, 22740, ..., 23923, 23630, 23752],
              [22547, 22652, 22581, ..., 23855, 23479, 23526],
              [22403, 22502, 22400, ..., 23803, 23465, 23435],
              ...,
              [19875, 20162, 20651, ..., 21104, 21147, 21137],
              [19995, 20112, 20494, ..., 21035, 21088, 21086],
              [20260, 20105, 20246, ..., 21030, 21133, 21130]],
      
             [[23087, 23047, 23386, ..., 24706, 24494, 24657],
              [23113, 23248, 23603, ..., 24686, 24841, 24981],
              [22987, 23298, 23686, ..., 24578, 24823, 25309],
              ...,
              [21674, 21851, 21986, ..., 26707, 26704, 25920],
              [21728, 21932, 22043, ..., 26733, 26609, 25669],
              [21733, 21950, 22121, ..., 26463, 26376, 25570]],
      
             [[18834, 19252, 19536, ..., 14340, 15512, 15108],
              [18976, 19218, 19386, ..., 16206, 17182, 16089],
              [19309, 19368, 19461, ..., 18452, 18689, 16816],
              ...,
      ...
              ...,
              [19865, 19863, 19945, ..., 24200, 23640, 24208],
              [19953, 19869, 19893, ..., 24592, 23464, 23959],
              [19911, 19911, 19909, ..., 24582, 23516, 23859]],
      
             [[22663, 22529, 22469, ..., 24466, 24825, 25005],
              [22687, 22592, 22582, ..., 24670, 24874, 25056],
              [22648, 22628, 22672, ..., 24729, 24750, 24913],
              ...,
              [20498, 20591, 20677, ..., 18207, 17885, 17736],
              [20509, 20655, 20684, ..., 18534, 18497, 19309],
              [20409, 20634, 20726, ..., 19764, 20128, 21186]],
      
             [[ 7699,  7868,  7829, ..., 13641, 14192, 14936],
              [ 7650,  7887,  7858, ..., 13301, 13382, 13984],
              [ 7714,  7902,  7900, ..., 12764, 12745, 13200],
              ...,
              [14362, 14604, 15240, ..., 16377, 15933, 15382],
              [15570, 15444, 15873, ..., 17364, 16321, 15362],
              [15439, 15811, 16577, ..., 16694, 16075, 15613]]], dtype=uint16)
    • qa_pixel
      (time, y, x)
      uint16
      22280 22280 22280 ... 21824 21824
      units :
      bit_index
      nodata :
      1
      flags_definition :
      {'snow': {'bits': 5, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'clear': {'bits': 6, 'values': {'0': 'not_clear', '1': 'clear'}}, 'cloud': {'bits': 3, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'water': {'bits': 7, 'values': {'0': 'land_or_cloud', '1': 'water'}}, 'cirrus': {'bits': 2, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'nodata': {'bits': 0, 'values': {'0': False, '1': True}}, 'qa_pixel': {'bits': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], 'values': {'1': 'Fill', '2': 'Dilated Cloud', '4': 'Cirrus', '8': 'Cloud', '16': 'Cloud Shadow', '32': 'Snow', '64': 'Clear', '128': 'Water', '256': 'Cloud Confidence low bit', '512': 'Cloud Confidence high bit', '1024': 'Cloud Shadow Confidence low bit', '2048': 'Cloud Shadow Confidence high bit', '4096': 'Snow Ice Confidence low bit', '8192': 'Snow Ice Confidence high bit', '16384': 'Cirrus Confidence low bit', '32768': 'Cirrus Confidence high bit'}, 'description': 'Level 2 pixel quality'}, 'cloud_shadow': {'bits': 4, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'dilated_cloud': {'bits': 1, 'values': {'0': 'not_dilated', '1': 'dilated'}}, 'cloud_confidence': {'bits': [8, 9], 'values': {'0': 'none', '1': 'low', '2': 'medium', '3': 'high'}}, 'cirrus_confidence': {'bits': [14, 15], 'values': {'0': 'none', '1': 'low', '2': 'reserved', '3': 'high'}}, 'snow_ice_confidence': {'bits': [12, 13], 'values': {'0': 'none', '1': 'low', '2': 'reserved', '3': 'high'}}, 'cloud_shadow_confidence': {'bits': [10, 11], 'values': {'0': 'none', '1': 'low', '2': 'reserved', '3': 'high'}}}
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              ...,
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280]],
      
             [[22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              ...,
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280]],
      
             [[22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              ...,
      ...
              ...,
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280]],
      
             [[22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              ...,
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280]],
      
             [[21952, 21952, 21952, ..., 23826, 23826, 22280],
              [21952, 21952, 21952, ..., 23826, 23826, 23826],
              [21952, 21952, 21952, ..., 23826, 23826, 23826],
              ...,
              [22280, 23826, 23826, ..., 21824, 21824, 21824],
              [23826, 23826, 23826, ..., 21824, 21824, 21824],
              [23826, 23826, 21762, ..., 21824, 21824, 21824]]], dtype=uint16)
    • qa_aerosol
      (time, y, x)
      uint8
      224 206 220 224 ... 160 106 96 96
      units :
      bit_index
      nodata :
      1
      flags_definition :
      {'water': {'bits': 2, 'values': {'0': 'not_water', '1': 'water'}}, 'nodata': {'bits': 0, 'values': {'0': False, '1': True}}, 'qa_aerosol': {'bits': [0, 1, 2, 3, 4, 5, 6, 7], 'values': {'1': 'Fill', '2': 'Valid aerosol retrieval', '4': 'Water', '8': 'Unused', '16': 'Unused', '32': 'Interpolated Aerosol', '64': 'Aerosol Level low bit', '128': 'Aerosol Level high bit'}, 'description': 'Aerosol quality assessment'}, 'aerosol_level': {'bits': [6, 7], 'values': {'0': 'climatology', '1': 'low', '2': 'medium', '3': 'high'}}, 'valid_retrieval': {'bits': 1, 'values': {'0': 'not_valid', '1': 'valid'}}, 'interp_retrieval': {'bits': 5, 'values': {'0': 'not_aerosol_interpolated', '1': 'aerosol_interpolated'}}}
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[224, 206, 220, ..., 208, 224, 223],
              [224, 216, 222, ..., 224, 224, 224],
              [224, 224, 224, ..., 212, 224, 222],
              ...,
              [224, 224, 224, ..., 224, 213, 220],
              [212, 224, 209, ..., 224, 211, 220],
              [223, 224, 219, ..., 224, 224, 224]],
      
             [[223, 224, 215, ..., 224, 224, 224],
              [224, 224, 224, ..., 222, 212, 224],
              [222, 224, 205, ..., 221, 208, 224],
              ...,
              [208, 212, 224, ..., 220, 224, 210],
              [222, 222, 224, ..., 224, 224, 224],
              [224, 224, 224, ..., 221, 224, 213]],
      
             [[103, 151, 161, ..., 223, 207, 224],
              [137, 157, 151, ..., 224, 224, 224],
              [159, 160, 160, ..., 222, 214, 224],
              ...,
      ...
              ...,
              [224, 224, 224, ..., 220, 224, 214],
              [207, 213, 224, ..., 221, 224, 209],
              [222, 222, 224, ..., 224, 224, 224]],
      
             [[219, 224, 207, ..., 223, 206, 224],
              [223, 224, 215, ..., 224, 224, 224],
              [224, 224, 224, ..., 222, 213, 224],
              ...,
              [224, 224, 224, ..., 220, 224, 214],
              [207, 213, 224, ..., 221, 224, 189],
              [222, 222, 224, ..., 196, 194, 166]],
      
             [[205, 221, 225, ..., 224, 223, 205],
              [218, 223, 224, ..., 224, 224, 224],
              [224, 224, 224, ..., 195, 197, 197],
              ...,
              [224, 224, 224, ...,  93,  92,  96],
              [224, 209, 214, ...,  98,  93,  96],
              [224, 221, 222, ..., 106,  96,  96]]], dtype=uint8)
    • qa_radsat
      (time, y, x)
      uint16
      0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
      units :
      bit_index
      nodata :
      1
      flags_definition :
      {'qa_radsat': {'bits': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 'values': {'1': 'Band 1 Data Saturation', '2': 'Band 2 Data Saturation', '4': 'Band 3 Data Saturation', '8': 'Band 4 Data Saturation', '16': 'Band 5 Data Saturation', '32': 'Band 6 Data Saturation', '64': 'Band 7 Data Saturation', '128': 'Unused', '256': 'Band 9 Data Saturation', '512': 'Unused', '1024': 'Unused', '2048': 'Terrain occlusion'}, 'description': 'Radiometric saturation'}, 'b1_saturation': {'bits': 0, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b2_saturation': {'bits': 1, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b3_saturation': {'bits': 2, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b4_saturation': {'bits': 3, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b5_saturation': {'bits': 4, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b6_saturation': {'bits': 5, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b7_saturation': {'bits': 6, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b9_saturation': {'bits': 8, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'terrain_occlusion': {'bits': 11, 'values': {'0': 'no_terrain_occlusion', '1': 'terrain_occlusion'}}}
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              ...,
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0]],
      
             [[0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              ...,
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0]],
      
             [[0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              ...,
      ...
              ...,
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0]],
      
             [[0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              ...,
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0]],
      
             [[0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              ...,
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0]]], dtype=uint16)
    • time
      PandasIndex
      PandasIndex(DatetimeIndex(['2021-01-03 14:39:19.317361', '2021-01-19 14:39:11.996195',
                     '2021-02-04 14:39:10.642566', '2021-02-20 14:39:06.231918',
                     '2021-03-08 14:38:58.547361', '2021-03-24 14:38:51.737798',
                     '2021-04-09 14:38:47.161063', '2021-04-25 14:38:39.521240',
                     '2021-05-11 14:38:36.106063', '2021-05-27 14:38:45.945207',
                     '2021-06-12 14:38:52.799593', '2021-06-28 14:38:56.818329',
                     '2021-07-14 14:38:58.020893', '2021-07-30 14:39:06.209510',
                     '2021-08-31 14:39:16.967976', '2021-09-16 14:39:20.745598',
                     '2021-10-02 14:39:25.506402', '2021-10-18 14:39:29.217991',
                     '2021-11-03 14:39:28.562838', '2021-11-19 14:39:23.470160',
                     '2021-12-05 14:39:25.260483', '2021-12-21 14:39:22.530622'],
                    dtype='datetime64[ns]', name='time', freq=None))
    • y
      PandasIndex
      PandasIndex(Float64Index([6692085.0, 6692055.0, 6692025.0, 6691995.0, 6691965.0, 6691935.0,
                    6691905.0, 6691875.0, 6691845.0, 6691815.0,
                    ...
                    6680955.0, 6680925.0, 6680895.0, 6680865.0, 6680835.0, 6680805.0,
                    6680775.0, 6680745.0, 6680715.0, 6680685.0],
                   dtype='float64', name='y', length=381))
    • x
      PandasIndex
      PandasIndex(Float64Index([857175.0, 857205.0, 857235.0, 857265.0, 857295.0, 857325.0,
                    857355.0, 857385.0, 857415.0, 857445.0,
                    ...
                    866925.0, 866955.0, 866985.0, 867015.0, 867045.0, 867075.0,
                    867105.0, 867135.0, 867165.0, 867195.0],
                   dtype='float64', name='x', length=335))
  • crs :
    epsg:32718
    grid_mapping :
    spatial_ref

How fast this step runs depends on how many cores are in your Jupyter notebook's local cluster. In real-world scenarios on an 8-core cluster, datacube.load() may take between 1/4 and 1/6 of the time it takes without Dask, depending on many factors. This is great!

Why not 1/8 of the time?

Dask has overheads, and datacube.load() itself is IO limited. All sorts of things impose limits, and part of the art of parallel computing is tuning your algorithm to reduce their impact and achieve greater performance. As we scale up this example we'll explore some of these.

Tip: recent updates to Dask have greatly improved performance and we are now seeing more substantial performance gains, more in line with the increase in cores.

Do not always expect 8x as many cores to produce 8x the speed-up. Algorithms can be tuned to perform better (or worse) as scale increases. This is part of the art of parallel programming. Dask does its best, and you can often do better.
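
One simple way to reason about the gap between 8 cores and a 4x to 6x speed-up is Amdahl's law. This model is not part of the notebook's workflow; it is a rough sketch that assumes the runtime splits cleanly into a serial part (scheduling overhead, IO waits) and a part that scales perfectly with the number of workers. The function name and the parallel fractions below are illustrative assumptions, not measurements from this cluster.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speed-up when only `parallel_fraction` of the runtime scales with `workers`."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical parallel fractions; real values depend on the data, storage and algorithm.
for fraction in (0.85, 0.90, 0.95, 1.00):
    speedup = amdahl_speedup(fraction, workers=8)
    print(f"parallel fraction {fraction:.2f}: about {speedup:.1f}x on 8 cores")
```

Under this model, even a small serial share is enough to pull an 8-core run well below 8x, which is consistent with the range quoted above.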

Exploiting delayed tasks

Now let's repeat the full example, with NDVI calculation and masking, but this time using Dask and .compute() to load the data.

First the dc.load()...

In [26]:
chunks = {"time":1}
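
To get a feel for what chunking along time means, the sketch below builds a synthetic Dask array with the same shape as one uint16 band in this notebook's outputs (22 time steps, 381 x 335 pixels) and splits it into one chunk per time step. It assumes the spatial dimensions are kept whole when only time is chunked; the array is filled with zeros and is a stand-in, not data loaded from the datacube.

```python
import dask.array as da

# Synthetic stand-in for a single band: (time, y, x) = (22, 381, 335), uint16.
# chunks=(1, -1, -1) means one time step per chunk, full extent in y and x.
band = da.zeros((22, 381, 335), chunks=(1, -1, -1), dtype="uint16")

# Number of chunks along each dimension and the shape of one chunk.
print("chunks per dimension:", band.numblocks)
print("shape of one chunk:", band.chunksize)

# Memory needed to hold one chunk once it is computed.
pixels_per_chunk = band.chunksize[0] * band.chunksize[1] * band.chunksize[2]
bytes_per_chunk = pixels_per_chunk * band.dtype.itemsize
print("bytes per chunk:", bytes_per_chunk)
```

Each chunk becomes a separate task, so the chunk count bounds how many workers can be kept busy at once, while the chunk size drives per-task memory use.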

This time, we will run the .compute() step straight away, resulting in real numbers being returned from Dask.

In [27]:
%%time
dataset = None  # clear results from any previous runs
# Passing dask_chunks makes dc.load() return a lazy, Dask-backed dataset
dataset = dc.load(
    product=product,
    x=study_area_lon,
    y=study_area_lat,
    time=set_time,
    measurements=measurements,
    resampling={"qa_pixel": "nearest", "*": "average"},
    output_crs=set_crs,
    resolution=set_resolution,
    dask_chunks=chunks,
    group_by=group_by,
)
# .compute() runs the delayed tasks and returns the data in memory
actual_dataset = dataset.compute()
CPU times: user 499 ms, sys: 81.9 ms, total: 581 ms
Wall time: 3.95 s
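
The same build-then-compute pattern applies to any calculation on Dask-backed data. The sketch below uses small synthetic red and near-infrared arrays (random numbers, not satellite data) to show that an NDVI expression only builds a task graph until .compute() is called. The variable names and array sizes are assumptions made for illustration.

```python
import numpy as np
import dask.array as da

# Small synthetic bands standing in for red and NIR: (time, y, x) = (4, 6, 5).
rng = np.random.default_rng(0)
red_np = rng.integers(1, 20000, size=(4, 6, 5)).astype("float32")
nir_np = rng.integers(1, 20000, size=(4, 6, 5)).astype("float32")

# Wrap them as Dask arrays with one chunk per time step.
red = da.from_array(red_np, chunks=(1, -1, -1))
nir = da.from_array(nir_np, chunks=(1, -1, -1))

# This line only records the tasks to run; no NDVI values exist yet.
ndvi_lazy = (nir - red) / (nir + red)
print(type(ndvi_lazy))

# .compute() executes the task graph and returns a NumPy array.
ndvi = ndvi_lazy.compute()
print(type(ndvi), ndvi.shape)
```

Calling .compute() once at the end, rather than after every step, gives Dask the whole graph to schedule in one go.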
In [28]:
actual_dataset
Out[28]:
<xarray.Dataset>
Dimensions:      (time: 22, y: 381, x: 335)
Coordinates:
  * time         (time) datetime64[ns] 2021-01-03T14:39:19.317361 ... 2021-12...
  * y            (y) float64 6.692e+06 6.692e+06 ... 6.681e+06 6.681e+06
  * x            (x) float64 8.572e+05 8.572e+05 ... 8.672e+05 8.672e+05
    spatial_ref  int32 32718
Data variables:
    coastal      (time, y, x) uint16 38342 38311 37994 37453 ... 9836 9832 9814
    blue         (time, y, x) uint16 38117 38116 37811 ... 10336 10392 10318
    green        (time, y, x) uint16 36410 36469 36214 ... 11842 11640 11518
    red          (time, y, x) uint16 36244 36332 36158 ... 13181 12926 12585
    nir08        (time, y, x) uint16 35466 35587 35444 ... 14604 14630 14254
    swir16       (time, y, x) uint16 28398 28591 28442 ... 17902 17175 16479
    swir22       (time, y, x) uint16 22729 22865 22740 ... 16694 16075 15613
    qa_pixel     (time, y, x) uint16 22280 22280 22280 ... 21824 21824 21824
    qa_aerosol   (time, y, x) uint8 224 206 220 224 210 ... 166 160 106 96 96
    qa_radsat    (time, y, x) uint16 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
Attributes:
    crs:           epsg:32718
    grid_mapping:  spatial_ref
xarray.Dataset
    • time: 22
    • y: 381
    • x: 335
    • time
      (time)
      datetime64[ns]
      2021-01-03T14:39:19.317361 ... 2...
      units :
      seconds since 1970-01-01 00:00:00
      array(['2021-01-03T14:39:19.317361000', '2021-01-19T14:39:11.996195000',
             '2021-02-04T14:39:10.642566000', '2021-02-20T14:39:06.231918000',
             '2021-03-08T14:38:58.547361000', '2021-03-24T14:38:51.737798000',
             '2021-04-09T14:38:47.161063000', '2021-04-25T14:38:39.521240000',
             '2021-05-11T14:38:36.106063000', '2021-05-27T14:38:45.945207000',
             '2021-06-12T14:38:52.799593000', '2021-06-28T14:38:56.818329000',
             '2021-07-14T14:38:58.020893000', '2021-07-30T14:39:06.209510000',
             '2021-08-31T14:39:16.967976000', '2021-09-16T14:39:20.745598000',
             '2021-10-02T14:39:25.506402000', '2021-10-18T14:39:29.217991000',
             '2021-11-03T14:39:28.562838000', '2021-11-19T14:39:23.470160000',
             '2021-12-05T14:39:25.260483000', '2021-12-21T14:39:22.530622000'],
            dtype='datetime64[ns]')
    • y
      (y)
      float64
      6.692e+06 6.692e+06 ... 6.681e+06
      units :
      metre
      resolution :
      -30.0
      crs :
      epsg:32718
      array([6692085., 6692055., 6692025., ..., 6680745., 6680715., 6680685.])
    • x
      (x)
      float64
      8.572e+05 8.572e+05 ... 8.672e+05
      units :
      metre
      resolution :
      30.0
      crs :
      epsg:32718
      array([857175., 857205., 857235., ..., 867135., 867165., 867195.])
    • spatial_ref
      ()
      int32
      32718
      spatial_ref :
      PROJCS["WGS 84 / UTM zone 18S",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-75],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",10000000],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32718"]]
      grid_mapping_name :
      transverse_mercator
      array(32718, dtype=int32)
    • coastal
      (time, y, x)
      uint16
      38342 38311 37994 ... 9832 9814
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[38342, 38311, 37994, ..., 44083, 43955, 43934],
              [38034, 37992, 37796, ..., 44049, 43865, 43804],
              [37705, 37644, 37454, ..., 44028, 43837, 43745],
              ...,
              [39221, 39599, 40046, ..., 43182, 43265, 43277],
              [39230, 39480, 39864, ..., 43105, 43200, 43240],
              [39448, 39398, 39575, ..., 43075, 43200, 43245]],
      
             [[27076, 26975, 27310, ..., 25533, 24978, 24973],
              [27157, 27323, 27590, ..., 26099, 26176, 26102],
              [27120, 27594, 27725, ..., 26409, 26891, 27563],
              ...,
              [33571, 33770, 33852, ..., 35482, 35125, 34172],
              [33678, 33867, 33903, ..., 35285, 34891, 33944],
              [33660, 33857, 33919, ..., 34770, 34488, 33868]],
      
             [[22430, 23160, 23876, ...,  7597,  8801,  8342],
              [22561, 22890, 23277, ...,  7435,  9220,  8240],
              [22800, 23003, 23224, ..., 10056, 10611,  7981],
              ...,
      ...
              ...,
              [40494, 40391, 40363, ..., 38437, 38234, 38675],
              [40572, 40452, 40365, ..., 38796, 38274, 38645],
              [40524, 40472, 40389, ..., 38892, 38439, 38663]],
      
             [[27891, 27549, 27452, ..., 27017, 27318, 27313],
              [28239, 28003, 27994, ..., 27182, 27370, 27468],
              [28520, 28420, 28530, ..., 27367, 27477, 27659],
              ...,
              [32798, 32936, 32995, ..., 13854, 13077, 12826],
              [32719, 32817, 32796, ..., 14677, 14220, 15676],
              [32457, 32784, 32884, ..., 16920, 17017, 18179]],
      
             [[ 7442,  7587,  7582, ...,  8433,  8483,  8972],
              [ 7443,  7609,  7562, ...,  8409,  8387,  8739],
              [ 7462,  7632,  7596, ...,  8375,  8239,  8454],
              ...,
              [ 9772, 10100, 10620, ...,  9249,  9296,  9214],
              [10608, 10539, 10712, ..., 10035,  9847,  9545],
              [10435, 10416, 10757, ...,  9836,  9832,  9814]]], dtype=uint16)
    • blue
      (time, y, x)
      uint16
      38117 38116 37811 ... 10392 10318
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[38117, 38116, 37811, ..., 44195, 44015, 43985],
              [37834, 37739, 37606, ..., 44154, 43935, 43845],
              [37501, 37389, 37268, ..., 44109, 43910, 43793],
              ...,
              [39101, 39429, 39853, ..., 43080, 43194, 43213],
              [39131, 39336, 39679, ..., 43039, 43139, 43150],
              [39295, 39238, 39424, ..., 43014, 43109, 43154]],
      
             [[27168, 27101, 27438, ..., 25928, 25379, 25442],
              [27305, 27505, 27710, ..., 26476, 26598, 26593],
              [27300, 27777, 27837, ..., 26754, 27253, 27887],
              ...,
              [33676, 33864, 33973, ..., 35803, 35529, 34526],
              [33757, 33941, 34012, ..., 35601, 35243, 34280],
              [33756, 33950, 34037, ..., 35154, 34816, 34224]],
      
             [[22372, 23106, 23847, ...,  8739,  9908,  9477],
              [22346, 22689, 23262, ...,  8539, 10370,  9571],
              [22770, 22982, 23282, ..., 10638, 11391,  9186],
              ...,
      ...
              ...,
              [40649, 40525, 40472, ..., 38548, 38228, 38640],
              [40698, 40530, 40427, ..., 39013, 38400, 38638],
              [40619, 40552, 40477, ..., 39063, 38525, 38720]],
      
             [[27883, 27516, 27419, ..., 27262, 27540, 27496],
              [28258, 27992, 27933, ..., 27375, 27510, 27662],
              [28512, 28341, 28389, ..., 27546, 27623, 27823],
              ...,
              [32777, 32921, 33019, ..., 14052, 13155, 12833],
              [32716, 32814, 32816, ..., 14774, 14331, 15794],
              [32469, 32766, 32885, ..., 17086, 17268, 18448]],
      
             [[ 7595,  7737,  7723, ...,  9038,  9145,  9752],
              [ 7578,  7745,  7688, ...,  8938,  8900,  9341],
              [ 7598,  7768,  7749, ...,  8826,  8725,  8945],
              ...,
              [10741, 11074, 11619, ...,  9825,  9719,  9561],
              [11657, 11585, 11661, ..., 10663, 10340,  9900],
              [11445, 11325, 11708, ..., 10336, 10392, 10318]]], dtype=uint16)
    • green
      (time, y, x)
      uint16
      36410 36469 36214 ... 11640 11518
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[36410, 36469, 36214, ..., 42603, 42349, 42321],
              [36123, 36094, 35915, ..., 42559, 42260, 42164],
              [35816, 35769, 35624, ..., 42484, 42263, 42119],
              ...,
              [37460, 37698, 38115, ..., 41499, 41619, 41638],
              [37501, 37594, 37906, ..., 41446, 41559, 41580],
              [37688, 37538, 37694, ..., 41419, 41545, 41591]],
      
             [[26880, 26742, 27191, ..., 26317, 25813, 25896],
              [27010, 27152, 27463, ..., 26746, 26854, 26917],
              [26911, 27361, 27542, ..., 27032, 27426, 28019],
              ...,
              [32855, 33040, 33127, ..., 35255, 35011, 34073],
              [32932, 33126, 33171, ..., 35121, 34809, 33779],
              [32954, 33120, 33199, ..., 34639, 34448, 33617]],
      
             [[21965, 22629, 23283, ..., 11160, 12365, 11911],
              [22171, 22579, 22940, ..., 12091, 13429, 12904],
              [22658, 22750, 22980, ..., 13804, 14210, 12850],
              ...,
      ...
              ...,
              [39407, 39236, 39170, ..., 37291, 36989, 37393],
              [39366, 39209, 39091, ..., 37847, 37163, 37435],
              [39254, 39214, 39120, ..., 38019, 37290, 37482]],
      
             [[27241, 26945, 26794, ..., 27167, 27412, 27497],
              [27532, 27349, 27264, ..., 27347, 27502, 27661],
              [27743, 27616, 27676, ..., 27518, 27562, 27820],
              ...,
              [31765, 31918, 32035, ..., 15309, 14685, 14379],
              [31685, 31843, 31894, ..., 15810, 15522, 16681],
              [31427, 31784, 31960, ..., 17692, 18021, 19118]],
      
             [[ 7736,  7941,  7892, ..., 10526, 10736, 11328],
              [ 7671,  7949,  7898, ..., 10363, 10295, 10738],
              [ 7715,  7950,  7940, ..., 10119,  9943, 10234],
              ...,
              [12354, 12793, 13603, ..., 11697, 11258, 10662],
              [13889, 13729, 13698, ..., 12433, 11739, 11010],
              [13353, 13147, 13381, ..., 11842, 11640, 11518]]], dtype=uint16)
    • red
      (time, y, x)
      uint16
      36244 36332 36158 ... 12926 12585
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[36244, 36332, 36158, ..., 42967, 42709, 42677],
              [35998, 35979, 35849, ..., 42915, 42607, 42484],
              [35682, 35681, 35549, ..., 42836, 42604, 42447],
              ...,
              [37428, 37620, 37992, ..., 41429, 41547, 41576],
              [37429, 37471, 37771, ..., 41378, 41484, 41530],
              [37592, 37433, 37576, ..., 41362, 41495, 41545]],
      
             [[27213, 27095, 27458, ..., 27127, 26703, 26761],
              [27234, 27456, 27736, ..., 27544, 27656, 27706],
              [27176, 27612, 27821, ..., 27831, 28215, 28790],
              ...,
              [33025, 33207, 33306, ..., 35669, 35493, 34558],
              [33113, 33302, 33345, ..., 35562, 35296, 34287],
              [33108, 33301, 33399, ..., 35095, 34913, 34089]],
      
             [[22172, 22821, 23567, ..., 12047, 13341, 12884],
              [22471, 22814, 23173, ..., 13060, 14513, 13759],
              [22871, 22966, 23204, ..., 15070, 15551, 13813],
              ...,
      ...
              ...,
              [39517, 39335, 39280, ..., 37443, 37083, 37425],
              [39477, 39303, 39177, ..., 38046, 37320, 37467],
              [39367, 39325, 39212, ..., 38208, 37458, 37509]],
      
             [[27331, 27030, 26943, ..., 27857, 28134, 28228],
              [27621, 27446, 27380, ..., 28047, 28222, 28404],
              [27805, 27707, 27759, ..., 28244, 28278, 28527],
              ...,
              [31877, 32021, 32172, ..., 16694, 16047, 15613],
              [31821, 31970, 32045, ..., 17214, 16910, 17848],
              [31564, 31911, 32107, ..., 18917, 19166, 20199]],
      
             [[ 7419,  7624,  7592, ..., 11435, 11673, 12374],
              [ 7382,  7641,  7612, ..., 11287, 11148, 11619],
              [ 7438,  7670,  7660, ..., 11009, 10734, 10992],
              ...,
              [13190, 13732, 14626, ..., 12881, 12395, 11698],
              [15063, 14820, 14769, ..., 13878, 13039, 12081],
              [14407, 14330, 14429, ..., 13181, 12926, 12585]]], dtype=uint16)
    • nir08
      (time, y, x)
      uint16
      35466 35587 35444 ... 14630 14254
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[35466, 35587, 35444, ..., 42378, 42091, 42058],
              [35230, 35267, 35165, ..., 42307, 41987, 41873],
              [35002, 35009, 34904, ..., 42219, 41975, 41824],
              ...,
              [36732, 36868, 37218, ..., 40664, 40748, 40799],
              [36761, 36768, 37030, ..., 40594, 40696, 40692],
              [36954, 36731, 36816, ..., 40546, 40676, 40735]],
      
             [[27594, 27506, 27772, ..., 28259, 27895, 27930],
              [27587, 27791, 28031, ..., 28625, 28733, 28775],
              [27553, 27958, 28127, ..., 28833, 29154, 29678],
              ...,
              [33236, 33432, 33540, ..., 35916, 35857, 34974],
              [33344, 33525, 33611, ..., 35847, 35671, 34724],
              [33376, 33562, 33693, ..., 35497, 35312, 34544]],
      
             [[22723, 23366, 24027, ..., 13848, 15247, 14713],
              [22982, 23226, 23594, ..., 15433, 16732, 15662],
              [23311, 23433, 23648, ..., 17551, 17903, 16019],
              ...,
      ...
              ...,
              [39312, 39144, 39070, ..., 37289, 36803, 37016],
              [39289, 39109, 38957, ..., 37971, 37137, 37100],
              [39126, 39097, 39011, ..., 38042, 37248, 37216]],
      
             [[27448, 27189, 27100, ..., 28570, 28855, 28957],
              [27715, 27515, 27468, ..., 28771, 28937, 29108],
              [27850, 27741, 27765, ..., 28952, 28982, 29155],
              ...,
              [31830, 31993, 32139, ..., 18652, 17989, 17672],
              [31745, 31956, 32101, ..., 18990, 18687, 19613],
              [31607, 31942, 32134, ..., 20570, 20826, 21736]],
      
             [[ 7411,  7621,  7587, ..., 12709, 13032, 13900],
              [ 7416,  7629,  7596, ..., 12528, 12372, 12974],
              [ 7467,  7665,  7659, ..., 12175, 11944, 12296],
              ...,
              [15364, 16694, 17527, ..., 14461, 13810, 13094],
              [16789, 16667, 16583, ..., 15325, 14520, 13480],
              [15983, 15927, 16180, ..., 14604, 14630, 14254]]], dtype=uint16)
    • swir16
      (time, y, x)
      uint16
      28398 28591 28442 ... 17175 16479
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[28398, 28591, 28442, ..., 30118, 29786, 29930],
              [28186, 28312, 28221, ..., 30032, 29642, 29664],
              [27996, 28111, 27998, ..., 29934, 29623, 29559],
              ...,
              [26532, 26787, 27321, ..., 28380, 28445, 28429],
              [26589, 26657, 27083, ..., 28320, 28394, 28388],
              [26866, 26650, 26779, ..., 28314, 28429, 28429]],
      
             [[25585, 25575, 25853, ..., 27726, 27537, 27653],
              [25560, 25737, 26064, ..., 27880, 28042, 28162],
              [25479, 25798, 26099, ..., 27908, 28165, 28640],
              ...,
              [27130, 27325, 27445, ..., 31175, 31137, 30369],
              [27198, 27411, 27499, ..., 31205, 31115, 30167],
              [27220, 27416, 27560, ..., 30964, 30899, 30002]],
      
             [[21116, 21673, 22074, ..., 16096, 17669, 17086],
              [21353, 21659, 21816, ..., 18212, 19505, 18243],
              [21713, 21765, 21869, ..., 20604, 21004, 18995],
              ...,
      ...
              ...,
              [27195, 27144, 27173, ..., 29233, 28733, 29158],
              [27270, 27144, 27104, ..., 29677, 28658, 29012],
              [27208, 27168, 27130, ..., 29842, 28746, 28930]],
      
             [[25201, 25042, 24912, ..., 27466, 27819, 28041],
              [25306, 25195, 25144, ..., 27753, 27978, 28168],
              [25311, 25255, 25276, ..., 27918, 27932, 28060],
              ...,
              [25740, 25893, 25977, ..., 19832, 19464, 19251],
              [25711, 25906, 25999, ..., 20070, 19858, 20683],
              [25561, 25819, 25995, ..., 21233, 21480, 22485]],
      
             [[ 7684,  7888,  7857, ..., 14840, 15341, 16171],
              [ 7623,  7915,  7891, ..., 14468, 14472, 15190],
              [ 7701,  7934,  7943, ..., 13984, 13812, 14275],
              ...,
              [15317, 15883, 16667, ..., 17774, 17137, 16249],
              [16749, 16633, 16853, ..., 18817, 17565, 16250],
              [16369, 16634, 17149, ..., 17902, 17175, 16479]]], dtype=uint16)
    • swir22
      (time, y, x)
      uint16
      22729 22865 22740 ... 16075 15613
      units :
      reflectance
      nodata :
      0
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[22729, 22865, 22740, ..., 23923, 23630, 23752],
              [22547, 22652, 22581, ..., 23855, 23479, 23526],
              [22403, 22502, 22400, ..., 23803, 23465, 23435],
              ...,
              [19875, 20162, 20651, ..., 21104, 21147, 21137],
              [19995, 20112, 20494, ..., 21035, 21088, 21086],
              [20260, 20105, 20246, ..., 21030, 21133, 21130]],
      
             [[23087, 23047, 23386, ..., 24706, 24494, 24657],
              [23113, 23248, 23603, ..., 24686, 24841, 24981],
              [22987, 23298, 23686, ..., 24578, 24823, 25309],
              ...,
              [21674, 21851, 21986, ..., 26707, 26704, 25920],
              [21728, 21932, 22043, ..., 26733, 26609, 25669],
              [21733, 21950, 22121, ..., 26463, 26376, 25570]],
      
             [[18834, 19252, 19536, ..., 14340, 15512, 15108],
              [18976, 19218, 19386, ..., 16206, 17182, 16089],
              [19309, 19368, 19461, ..., 18452, 18689, 16816],
              ...,
      ...
              ...,
              [19865, 19863, 19945, ..., 24200, 23640, 24208],
              [19953, 19869, 19893, ..., 24592, 23464, 23959],
              [19911, 19911, 19909, ..., 24582, 23516, 23859]],
      
             [[22663, 22529, 22469, ..., 24466, 24825, 25005],
              [22687, 22592, 22582, ..., 24670, 24874, 25056],
              [22648, 22628, 22672, ..., 24729, 24750, 24913],
              ...,
              [20498, 20591, 20677, ..., 18207, 17885, 17736],
              [20509, 20655, 20684, ..., 18534, 18497, 19309],
              [20409, 20634, 20726, ..., 19764, 20128, 21186]],
      
             [[ 7699,  7868,  7829, ..., 13641, 14192, 14936],
              [ 7650,  7887,  7858, ..., 13301, 13382, 13984],
              [ 7714,  7902,  7900, ..., 12764, 12745, 13200],
              ...,
              [14362, 14604, 15240, ..., 16377, 15933, 15382],
              [15570, 15444, 15873, ..., 17364, 16321, 15362],
              [15439, 15811, 16577, ..., 16694, 16075, 15613]]], dtype=uint16)
    • qa_pixel
      (time, y, x)
      uint16
      22280 22280 22280 ... 21824 21824
      units :
      bit_index
      nodata :
      1
      flags_definition :
      {'snow': {'bits': 5, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'clear': {'bits': 6, 'values': {'0': 'not_clear', '1': 'clear'}}, 'cloud': {'bits': 3, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'water': {'bits': 7, 'values': {'0': 'land_or_cloud', '1': 'water'}}, 'cirrus': {'bits': 2, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'nodata': {'bits': 0, 'values': {'0': False, '1': True}}, 'qa_pixel': {'bits': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], 'values': {'1': 'Fill', '2': 'Dilated Cloud', '4': 'Cirrus', '8': 'Cloud', '16': 'Cloud Shadow', '32': 'Snow', '64': 'Clear', '128': 'Water', '256': 'Cloud Confidence low bit', '512': 'Cloud Confidence high bit', '1024': 'Cloud Shadow Confidence low bit', '2048': 'Cloud Shadow Confidence high bit', '4096': 'Snow Ice Confidence low bit', '8192': 'Snow Ice Confidence high bit', '16384': 'Cirrus Confidence low bit', '32768': 'Cirrus Confidence high bit'}, 'description': 'Level 2 pixel quality'}, 'cloud_shadow': {'bits': 4, 'values': {'0': 'not_high_confidence', '1': 'high_confidence'}}, 'dilated_cloud': {'bits': 1, 'values': {'0': 'not_dilated', '1': 'dilated'}}, 'cloud_confidence': {'bits': [8, 9], 'values': {'0': 'none', '1': 'low', '2': 'medium', '3': 'high'}}, 'cirrus_confidence': {'bits': [14, 15], 'values': {'0': 'none', '1': 'low', '2': 'reserved', '3': 'high'}}, 'snow_ice_confidence': {'bits': [12, 13], 'values': {'0': 'none', '1': 'low', '2': 'reserved', '3': 'high'}}, 'cloud_shadow_confidence': {'bits': [10, 11], 'values': {'0': 'none', '1': 'low', '2': 'reserved', '3': 'high'}}}
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              ...,
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280]],
      
             [[22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              ...,
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280]],
      
             [[22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              ...,
      ...
              ...,
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280]],
      
             [[22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              ...,
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280],
              [22280, 22280, 22280, ..., 22280, 22280, 22280]],
      
             [[21952, 21952, 21952, ..., 23826, 23826, 22280],
              [21952, 21952, 21952, ..., 23826, 23826, 23826],
              [21952, 21952, 21952, ..., 23826, 23826, 23826],
              ...,
              [22280, 23826, 23826, ..., 21824, 21824, 21824],
              [23826, 23826, 23826, ..., 21824, 21824, 21824],
              [23826, 23826, 21762, ..., 21824, 21824, 21824]]], dtype=uint16)
    • qa_aerosol
      (time, y, x)
      uint8
      224 206 220 224 ... 160 106 96 96
      units :
      bit_index
      nodata :
      1
      flags_definition :
      {'water': {'bits': 2, 'values': {'0': 'not_water', '1': 'water'}}, 'nodata': {'bits': 0, 'values': {'0': False, '1': True}}, 'qa_aerosol': {'bits': [0, 1, 2, 3, 4, 5, 6, 7], 'values': {'1': 'Fill', '2': 'Valid aerosol retrieval', '4': 'Water', '8': 'Unused', '16': 'Unused', '32': 'Interpolated Aerosol', '64': 'Aerosol Level low bit', '128': 'Aerosol Level high bit'}, 'description': 'Aerosol quality assessment'}, 'aerosol_level': {'bits': [6, 7], 'values': {'0': 'climatology', '1': 'low', '2': 'medium', '3': 'high'}}, 'valid_retrieval': {'bits': 1, 'values': {'0': 'not_valid', '1': 'valid'}}, 'interp_retrieval': {'bits': 5, 'values': {'0': 'not_aerosol_interpolated', '1': 'aerosol_interpolated'}}}
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[224, 206, 220, ..., 208, 224, 223],
              [224, 216, 222, ..., 224, 224, 224],
              [224, 224, 224, ..., 212, 224, 222],
              ...,
              [224, 224, 224, ..., 224, 213, 220],
              [212, 224, 209, ..., 224, 211, 220],
              [223, 224, 219, ..., 224, 224, 224]],
      
             [[223, 224, 215, ..., 224, 224, 224],
              [224, 224, 224, ..., 222, 212, 224],
              [222, 224, 205, ..., 221, 208, 224],
              ...,
              [208, 212, 224, ..., 220, 224, 210],
              [222, 222, 224, ..., 224, 224, 224],
              [224, 224, 224, ..., 221, 224, 213]],
      
             [[103, 151, 161, ..., 223, 207, 224],
              [137, 157, 151, ..., 224, 224, 224],
              [159, 160, 160, ..., 222, 214, 224],
              ...,
      ...
              ...,
              [224, 224, 224, ..., 220, 224, 214],
              [207, 213, 224, ..., 221, 224, 209],
              [222, 222, 224, ..., 224, 224, 224]],
      
             [[219, 224, 207, ..., 223, 206, 224],
              [223, 224, 215, ..., 224, 224, 224],
              [224, 224, 224, ..., 222, 213, 224],
              ...,
              [224, 224, 224, ..., 220, 224, 214],
              [207, 213, 224, ..., 221, 224, 189],
              [222, 222, 224, ..., 196, 194, 166]],
      
             [[205, 221, 225, ..., 224, 223, 205],
              [218, 223, 224, ..., 224, 224, 224],
              [224, 224, 224, ..., 195, 197, 197],
              ...,
              [224, 224, 224, ...,  93,  92,  96],
              [224, 209, 214, ...,  98,  93,  96],
              [224, 221, 222, ..., 106,  96,  96]]], dtype=uint8)
    • qa_radsat
      (time, y, x)
      uint16
      0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0
      units :
      bit_index
      nodata :
      1
      flags_definition :
      {'qa_radsat': {'bits': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 'values': {'1': 'Band 1 Data Saturation', '2': 'Band 2 Data Saturation', '4': 'Band 3 Data Saturation', '8': 'Band 4 Data Saturation', '16': 'Band 5 Data Saturation', '32': 'Band 6 Data Saturation', '64': 'Band 7 Data Saturation', '128': 'Unused', '256': 'Band 9 Data Saturation', '512': 'Unused', '1024': 'Unused', '2048': 'Terrain occlusion'}, 'description': 'Radiometric saturation'}, 'b1_saturation': {'bits': 0, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b2_saturation': {'bits': 1, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b3_saturation': {'bits': 2, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b4_saturation': {'bits': 3, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b5_saturation': {'bits': 4, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b6_saturation': {'bits': 5, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b7_saturation': {'bits': 6, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'b9_saturation': {'bits': 8, 'values': {'0': 'no_saturation', '1': 'saturated_data'}}, 'terrain_occlusion': {'bits': 11, 'values': {'0': 'no_terrain_occlusion', '1': 'terrain_occlusion'}}}
      crs :
      epsg:32718
      grid_mapping :
      spatial_ref
      array([[[0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              ...,
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0]],
      
             [[0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              ...,
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0]],
      
             [[0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              ...,
      ...
              ...,
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0]],
      
             [[0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              ...,
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0]],
      
             [[0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              ...,
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0],
              [0, 0, 0, ..., 0, 0, 0]]], dtype=uint16)
    • time
      PandasIndex
      PandasIndex(DatetimeIndex(['2021-01-03 14:39:19.317361', '2021-01-19 14:39:11.996195',
                     '2021-02-04 14:39:10.642566', '2021-02-20 14:39:06.231918',
                     '2021-03-08 14:38:58.547361', '2021-03-24 14:38:51.737798',
                     '2021-04-09 14:38:47.161063', '2021-04-25 14:38:39.521240',
                     '2021-05-11 14:38:36.106063', '2021-05-27 14:38:45.945207',
                     '2021-06-12 14:38:52.799593', '2021-06-28 14:38:56.818329',
                     '2021-07-14 14:38:58.020893', '2021-07-30 14:39:06.209510',
                     '2021-08-31 14:39:16.967976', '2021-09-16 14:39:20.745598',
                     '2021-10-02 14:39:25.506402', '2021-10-18 14:39:29.217991',
                     '2021-11-03 14:39:28.562838', '2021-11-19 14:39:23.470160',
                     '2021-12-05 14:39:25.260483', '2021-12-21 14:39:22.530622'],
                    dtype='datetime64[ns]', name='time', freq=None))
    • y
      PandasIndex
      PandasIndex(Float64Index([6692085.0, 6692055.0, 6692025.0, 6691995.0, 6691965.0, 6691935.0,
                    6691905.0, 6691875.0, 6691845.0, 6691815.0,
                    ...
                    6680955.0, 6680925.0, 6680895.0, 6680865.0, 6680835.0, 6680805.0,
                    6680775.0, 6680745.0, 6680715.0, 6680685.0],
                   dtype='float64', name='y', length=381))
    • x
      PandasIndex
      PandasIndex(Float64Index([857175.0, 857205.0, 857235.0, 857265.0, 857295.0, 857325.0,
                    857355.0, 857385.0, 857415.0, 857445.0,
                    ...
                    866925.0, 866955.0, 866985.0, 867015.0, 867045.0, 867075.0,
                    867105.0, 867135.0, 867165.0, 867195.0],
                   dtype='float64', name='x', length=335))
  • crs :
    epsg:32718
    grid_mapping :
    spatial_ref

Now use actual_dataset to compute the NDVI for all observation times.

In [29]:
# Identify pixels that don't have cloud, cloud shadow or water
cloud_free_mask = masking.make_mask(actual_dataset['qa_pixel'], **good_pixel_flags)

# Apply the mask
cloud_free = actual_dataset.where(cloud_free_mask)

# Calculate the components that make up the NDVI calculation
band_diff = cloud_free.nir08 - cloud_free.red
band_sum = cloud_free.nir08 + cloud_free.red
# Calculate NDVI (kept as its own DataArray, not stored in the dataset)
ndvi = None
ndvi = band_diff / band_sum

This completed very quickly because most of the time is spent in the data load; the actual calculation takes less than 1 second.
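To make the masking step less of a black box, here is a minimal sketch of what a bit-flag mask does, using only NumPy. The bit positions are copied from the qa_pixel flags_definition printed above, and the four qa values appear in that output. The "good pixel" rule (clear and not water) and the red/nir numbers are assumptions for illustration; the notebook's own good_pixel_flags and masking.make_mask() remain the authoritative path.

```python
import numpy as np

# Single-bit flags and their bit positions, from the qa_pixel flags_definition.
QA_BITS = {
    "dilated_cloud": 1,
    "cirrus": 2,
    "cloud": 3,
    "cloud_shadow": 4,
    "snow": 5,
    "clear": 6,
    "water": 7,
}


def decode(value):
    """Return the names of the single-bit flags set in one qa_pixel value."""
    return [name for name, bit in QA_BITS.items() if (int(value) >> bit) & 1]


# qa_pixel values taken from the array output above.
qa = np.array([22280, 21824, 21952, 23826], dtype=np.uint16)
for value in qa:
    print(value, decode(value))

# Made-up band values, one per qa value, purely for illustration.
red = np.array([9000.0, 8000.0, 7500.0, 8200.0])
nir = np.array([15000.0, 16000.0, 7000.0, 9000.0])

# Assumed rule: keep pixels whose clear bit is set and whose water bit is not.
clear = ((qa >> QA_BITS["clear"]) & 1).astype(bool)
water = ((qa >> QA_BITS["water"]) & 1).astype(bool)
good = clear & ~water

# Masked-out pixels become NaN, mirroring what dataset.where(mask) does.
ndvi = np.where(good, (nir - red) / (nir + red), np.nan)
print(good, ndvi)
```

Only the second pixel survives the assumed rule: the first is flagged as cloud, the third as water, and the fourth as dilated cloud and cloud shadow.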


Now let's repeat that entire load and NDVI calculation in a single cell and time it - this is just to get the total time for later comparison.

To ensure comparable timings, we will .restart() the Dask cluster. This makes sure that we aren't just seeing performance gains from data caching.

Note that this will show some Restarting worker warnings. That is ok; it is just telling you that each of the four workers in the cluster is restarting.

In [30]:
client.restart()
2023-05-24 15:45:33,401 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:33,402 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:33,418 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:33,420 - distributed.nanny - WARNING - Restarting worker
Out[30]:

Client

Client-c74f3414-fa49-11ed-99da-1eb1b782f397

Connection method: Cluster object Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status

Cluster Info

LocalCluster

d46c9864

Dashboard: http://127.0.0.1:8787/status Workers: 4
Total threads: 8 Total memory: 29.00 GiB
Status: running Using processes: True

Scheduler Info

Scheduler

Scheduler-04088dc5-463b-4f93-8651-13df22de11c5

Comm: tcp://127.0.0.1:36203 Workers: 4
Dashboard: http://127.0.0.1:8787/status Total threads: 8
Started: 1 minute ago Total memory: 29.00 GiB

Workers

Worker: 0

Comm: tcp://127.0.0.1:46273 Total threads: 2
Dashboard: http://127.0.0.1:34417/status Memory: 7.25 GiB
Nanny: tcp://127.0.0.1:46815
Local directory: /tmp/dask-worker-space/worker-1djz1521

Worker: 1

Comm: tcp://127.0.0.1:33933 Total threads: 2
Dashboard: http://127.0.0.1:43363/status Memory: 7.25 GiB
Nanny: tcp://127.0.0.1:38017
Local directory: /tmp/dask-worker-space/worker-qjo9izxw

Worker: 2

Comm: tcp://127.0.0.1:45613 Total threads: 2
Dashboard: http://127.0.0.1:39845/status Memory: 7.25 GiB
Nanny: tcp://127.0.0.1:41837
Local directory: /tmp/dask-worker-space/worker-6dn9lszu

Worker: 3

Comm: tcp://127.0.0.1:40649 Total threads: 2
Dashboard: http://127.0.0.1:34351/status Memory: 7.25 GiB
Nanny: tcp://127.0.0.1:36811
Local directory: /tmp/dask-worker-space/worker-3gief1m6
In [31]:
%%time
dataset = None # clear results from any previous runs
dataset = dc.load(
            product=product,
            x=study_area_lon,
            y=study_area_lat,
            time=set_time,
            measurements=measurements,
            resampling={"qa_pixel": "nearest", "*": "average"},
            output_crs=set_crs,
            resolution=set_resolution,
            dask_chunks = chunks, 
            group_by=group_by,
        )
actual_dataset = dataset.compute() ### Compute the dataset ###

# Identify pixels that don't have cloud, cloud shadow or water
cloud_free_mask = masking.make_mask(actual_dataset['qa_pixel'], **good_pixel_flags)

# Apply the mask
cloud_free = actual_dataset.where(cloud_free_mask)

# Calculate the components that make up the NDVI calculation
band_diff = cloud_free.nir08 - cloud_free.red
band_sum = cloud_free.nir08 + cloud_free.red
# Calculate NDVI (kept as its own DataArray, not stored in the dataset)
ndvi = None
ndvi = band_diff / band_sum
/env/lib/python3.10/site-packages/rasterio/warp.py:344: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
  _reproject(
/env/lib/python3.10/site-packages/rasterio/warp.py:344: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
  _reproject(
CPU times: user 638 ms, sys: 160 ms, total: 798 ms
Wall time: 6.09 s

That took around 6 seconds in the run shown above (on a cluster with 8 threads); your timings will vary. We can do better...

Data and computational locality

When compute() is called, dask not only executes all the tasks but also consolidates all the distributed chunks back into a normal array on the client machine, in this case the notebook's kernel. In the previous cell we have two variables that both refer to the data we are loading:

  1. dataset refers to the delayed version of the data. The delayed tasks and the chunks that make it up will be on the cluster.
  2. actual_dataset refers to the actual array in the notebook kernel memory after execution of the tasks. It is a complete array in memory in the notebook kernel (on the client).

So in the previous cell everything after the actual_dataset = dataset.compute() line is computed in the Jupyter kernel and doesn't use the dask cluster at all for computation.

If we shift the location of this compute() call we can perform more tasks in parallel on the dask cluster.

Tip: Locality is an important concept and applies to both data and computation
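The effect of where compute() sits can be sketched with plain dask arrays. This is an illustration under assumptions, not the notebook's pipeline: the two arrays are synthetic stand-ins for the nir08 and red bands, and with no Client created the default local scheduler is used rather than the LocalCluster.

```python
import numpy as np
import dask.array as da

# Synthetic stand-ins for two bands, chunked with one chunk per time step.
nir = da.from_array(np.full((3, 4, 4), 6.0), chunks=(1, 4, 4))
red = da.from_array(np.full((3, 4, 4), 2.0), chunks=(1, 4, 4))

# Compute early: the data is pulled back first, so the arithmetic below
# runs on ordinary NumPy arrays in the local Python process.
nir_local = nir.compute()
red_local = red.compute()
ndvi_early = (nir_local - red_local) / (nir_local + red_local)

# Compute late: the arithmetic is only recorded in the task graph here...
ndvi_lazy = (nir - red) / (nir + red)
print(type(ndvi_lazy))

# ...and runs chunk by chunk on the scheduler when compute() is called.
ndvi_late = ndvi_lazy.compute()
print(type(ndvi_late))
```

Both routes give the same numbers; what changes is which process does the arithmetic and how much data has to travel back to the client.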

Now let's repeat the load and NDVI calculation, but this time rather than calling compute() on the full dataset we'll run the compute at the cloud masking step (cloud_free = dataset.where(cloud_free_mask).compute()) so the masking operation can be performed in parallel. Let's see what the impact is...

In [32]:
client.restart()
2023-05-24 15:45:40,716 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:40,722 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:40,723 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:40,729 - distributed.nanny - WARNING - Restarting worker
In [33]:
%%time
dataset = None # clear results from any previous runs
del dataset
dataset = dc.load(
            product=product,
            x=study_area_lon,
            y=study_area_lat,
            time=set_time,
            measurements=measurements,
            resampling={"qa_pixel": "nearest", "*": "average"},
            output_crs=set_crs,
            resolution=set_resolution,
            dask_chunks = chunks, 
            group_by=group_by,
        )

# Identify pixels that don't have cloud, cloud shadow or water
cloud_free_mask = masking.make_mask(dataset['qa_pixel'], **good_pixel_flags)

# Apply the mask
cloud_free = dataset.where(cloud_free_mask).compute()    ### COMPUTE MOVED HERE ###

# Calculate the components that make up the NDVI calculation
band_diff = cloud_free.nir08 - cloud_free.red
band_sum = cloud_free.nir08 + cloud_free.red
# Calculate NDVI (kept as its own DataArray, not stored in the dataset)
ndvi = None
ndvi = band_diff / band_sum
actual_ndvi = ndvi
CPU times: user 865 ms, sys: 213 ms, total: 1.08 s
Wall time: 5.1 s

Not that different, but still a second or so quicker. This isn't too surprising since the masking operation is pretty quick (it's all numpy) and the data load is the bulk of the processing.

Dask can see the entire task graph for both load and mask computation. As a result some of the computation can be performed concurrently with file IO, and CPUs are busier as a result, so it will be slightly faster in practice but with IO dominating we won't see much overall improvement.

Perhaps doing more of the calculation on the cluster will help. Let's move the compute() call to the very end (ndvi.compute()) so the entire calculation is done on the cluster and only the final result is returned to the client.

In [34]:
client.restart()
2023-05-24 15:45:47,038 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:47,039 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:47,046 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:47,047 - distributed.nanny - WARNING - Restarting worker
In [35]:
%%time
dataset = None # clear results from any previous runs
dataset = dc.load(
            product=product,
            x=study_area_lon,
            y=study_area_lat,
            time=set_time,
            measurements=measurements,
            resampling={"qa_pixel": "nearest", "*": "average"},
            output_crs=set_crs,
            resolution=set_resolution,
            dask_chunks = chunks, 
            group_by=group_by,
        )

# Identify pixels that don't have cloud, cloud shadow or water
cloud_free_mask = masking.make_mask(dataset['qa_pixel'], **good_pixel_flags)

# Apply the mask
cloud_free = dataset.where(cloud_free_mask)

# Calculate the components that make up the NDVI calculation
band_diff = cloud_free.nir08 - cloud_free.red
band_sum = cloud_free.nir08 + cloud_free.red
# Calculate NDVI (kept as its own DataArray, not stored in the dataset)
ndvi = None
ndvi = band_diff / band_sum
actual_ndvi = ndvi.compute()    ### COMPUTE MOVED HERE ###
/env/lib/python3.10/site-packages/rasterio/warp.py:344: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
  _reproject(
CPU times: user 358 ms, sys: 50.1 ms, total: 409 ms
Wall time: 2.43 s

Now we are seeing a huge difference!

You may be thinking "Hold on a sec, the NDVI calculation is pretty quick in this example with such a small dataset, why such a big difference?" - and you'd be right. There is more going on.

Remember that dataset is a task graph with delayed tasks waiting to be executed when the result is required. In the example dataset many data variables are available, but only 3 are used to produce the ndvi (qa_pixel, red and nir08). As a result dask doesn't load the other variables, and because the computation time in this case is mostly IO, the execution time is a LOT faster.
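A small sketch can make this pruning visible. It uses dask.delayed rather than datacube, and load_band, the band values, and the loaded list are hypothetical names invented for the illustration. The synchronous scheduler is chosen only so that the bookkeeping list lives in the same process as the tasks.

```python
import dask

loaded = []  # records which bands were actually "read"


@dask.delayed
def load_band(name):
    """Hypothetical loader: pretend to read one band and note that it ran."""
    loaded.append(name)
    return {"red": 2.0, "nir08": 6.0, "blue": 1.0}[name]


# Three delayed loads are defined, but nothing has run yet.
bands = {name: load_band(name) for name in ("red", "nir08", "blue")}

# Only red and nir08 feed into the NDVI expression.
ndvi = (bands["nir08"] - bands["red"]) / (bands["nir08"] + bands["red"])

# Run in the local process so the loaded list is updated in place.
result = ndvi.compute(scheduler="synchronous")
print(result, sorted(loaded))
```

The blue load never executes because nothing in the requested result depends on it, which is the same reason the unused measurements cost nothing in the cell above.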

Of course we can save dask the trouble of figuring this out on our behalf and only load() the measurements we need in the first place. Let's check that now, we should see a similar performance figure.

In [36]:
client.restart()
2023-05-24 15:45:50,702 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:50,708 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:50,728 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:50,756 - distributed.nanny - WARNING - Restarting worker
In [37]:
%%time
dataset = None # clear results from any previous runs
measurements = [ "qa_pixel", "red", "nir08"]
dataset = dc.load(
            product=product,
            x=study_area_lon,
            y=study_area_lat,
            time=set_time,
            measurements=measurements,
            resampling={"qa_pixel": "nearest", "*": "average"},
            output_crs=set_crs,
            resolution=set_resolution,
            dask_chunks = chunks, 
            group_by=group_by,
        )

# Identify pixels that don't have cloud, cloud shadow or water
cloud_free_mask = masking.make_mask(dataset['qa_pixel'], **good_pixel_flags)
# Apply the mask
cloud_free = dataset.where(cloud_free_mask)

# Calculate the components that make up the NDVI calculation
band_diff = cloud_free.nir08 - cloud_free.red
band_sum = cloud_free.nir08 + cloud_free.red
# Calculate NDVI (kept as its own DataArray, not stored in the dataset)
ndvi = None
ndvi = band_diff / band_sum
actual_ndvi = ndvi.compute()
CPU times: user 302 ms, sys: 62.7 ms, total: 364 ms
Wall time: 2.44 s

Pretty similar, as expected. The client-side CPU time is slightly lower because there is less overhead and a smaller task graph, although the wall time in this run is essentially unchanged. It can pay to give dask a hand and not have the task graph cluttered with tasks you are not going to use. Still, it's nice to see that dask can save you some time by only computing what is required when you need it.

A quick check on the task graph

For completeness we will take a look at the task graph for the full calculation, all the way to the NDVI result. Given the complexity of the full graph we'll simplify it to 2 time observations like we did when the task graph was introduced previously.

In [38]:
set_time = ("2021-01-01", "2021-01-14")
In [39]:
client.restart()
2023-05-24 15:45:54,469 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:54,470 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:54,477 - distributed.nanny - WARNING - Restarting worker
2023-05-24 15:45:54,478 - distributed.nanny - WARNING - Restarting worker
In [40]:
%%time
dataset = None # clear results from any previous runs
measurements = [ "qa_pixel", "red", "nir08"]
dataset = dc.load(
            product=product,
            x=study_area_lon,
            y=study_area_lat,
            time=set_time,
            measurements=measurements,
            resampling={"qa_pixel": "nearest", "*": "average"},
            output_crs=set_crs,
            resolution=set_resolution,
            dask_chunks = chunks, 
            group_by=group_by,
        )

# Identify pixels that don't have cloud, cloud shadow or water
cloud_free_mask = masking.make_mask(dataset['qa_pixel'], **good_pixel_flags)
# Apply the mask
cloud_free = dataset.where(cloud_free_mask)

# Calculate the components that make up the NDVI calculation
band_diff = cloud_free.nir08 - cloud_free.red
band_sum = cloud_free.nir08 + cloud_free.red
# Calculate NDVI (kept as its own DataArray, not stored in the dataset)
ndvi = None
ndvi = band_diff / band_sum
CPU times: user 40.7 ms, sys: 8.63 ms, total: 49.4 ms
Wall time: 57 ms
In [41]:
ndvi.data.visualize()
Out[41]:

The computation flows from bottom to top in the task graph. You can see there are two main paths, one for each time (since the time chunk is length 1). You can also see the three data sources are loaded independently. After that it gets a little more difficult to follow, but you can see qa_pixel being used to produce the mask (and_, eq). The mask is then combined with the other two data variables via the where function. Finally comes the NDVI calculation: a sub, an add and a divide (truediv).

Dask has lots of internal optimisations that it uses to help identify the dependencies and parallel components of a task graph. Sometimes it will reorder or prune operations where possible to further optimise (for example, not loading data variables that aren't used in the NDVI calculation).

Tip: The task graph can be complex but it is a useful tool in understanding your algorithm and how it scales.
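When the rendered graph is hard to read, or graphviz is not installed, the graph can also be inspected as text. The sketch below assumes synthetic arrays in place of the loaded bands, with two time steps and a time chunk of length 1 to mirror the example above. Layer names typically begin with the operation name followed by a token, but the exact names depend on the dask version.

```python
import numpy as np
import dask.array as da

# Synthetic bands standing in for the red and nir08 variables.
red = da.from_array(np.full((2, 4, 4), 2.0), chunks=(1, 4, 4))
nir = da.from_array(np.full((2, 4, 4), 6.0), chunks=(1, 4, 4))
ndvi = (nir - red) / (nir + red)

# Number of chunks along (time, y, x): one independent path per time step.
print(ndvi.numblocks)

# High-level layers of the graph, roughly one per operation.
for layer_name in ndvi.dask.layers:
    print(layer_name)

# Total number of low-level tasks across all layers.
print(len(ndvi.dask))

result = ndvi.compute()
```

Watching how the task count grows as the time range or ROI increases gives a quick feel for how an algorithm will scale before anything is run.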

Be a good dask user - Clean up the cluster resources

In [42]:
client.close()

cluster.close()
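In scripts, as opposed to notebooks, the same clean-up can be made automatic with context managers. This is a sketch under assumptions: the cluster here is a tiny in-process one (one worker, one thread, dashboard disabled) chosen only to keep the example light, not the four-worker cluster used in this notebook.

```python
from dask.distributed import Client, LocalCluster

# A deliberately small in-process cluster, for illustration only.
with LocalCluster(
    n_workers=1,
    threads_per_worker=1,
    processes=False,
    dashboard_address=None,  # None disables the dashboard
) as cluster:
    with Client(cluster) as client:
        # Any work submitted inside the block runs on the cluster.
        total = client.submit(sum, [1, 2, 3]).result()

# On leaving the with blocks the client and then the cluster are closed,
# even if the body raised an exception.
print(total)
```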